Adhemerval Zanella Netto writes:

> On 13/04/23 13:23, Cupertino Miranda wrote:
>>
>> Hi Wilco,
>>
>> Exactly my remark on the patch. ;)
>>
>> I think the tunable is beneficial when we care to allocate hugepages for
>> malloc, etc., but still want to be able to force small pages for the stack.
>>
>> Imagine a scenario where you create lots of threads.  Most threads
>> barely use any stack, however there is one that somehow requires a lot
>> of it to do some crazy recursion. :)
>>
>> Most likely the heuristic would detect that hugepages would be useful
>> based on the stack size requirement, but it would never predict that it
>> only brings any benefit to 1% of the threads created.
>
> The problem is not finding when hugepages are beneficial, but rather when
> using them will incur falling back to default pages.  And re-reading the
> THP kernel docs and after some experiments, I am not sure it is really
> possible to come up with good heuristics to do so (not without poking
> at khugepaged stats).
>
> For instance, if the guard size is 0, THP will still back the thread stack.
> However, if you force stack alignment by issuing multiple mmaps, then
> khugepaged won't have an available VMA and thus won't use THP (using your
> example to force the mmap alignment in thread creation).
>
> I think my proposal will end up with a very limited and complicated
> heuristic (especially because khugepaged has various tunables itself),
> so I agree that the tunable is a better strategy.
>
>>
>> Regards,
>> Cupertino
>>
>>
>>
>> Wilco Dijkstra writes:
>>
>>> Hi Adhemerval,
>>>
>>> I agree doing this automatically sounds like a better solution.
>>> However:
>>>
>>> +static __always_inline int
>>> +advise_thp (void *mem, size_t size, size_t guardsize)
>>> +{
>>> +  enum malloc_thp_mode_t thpmode = __malloc_thp_mode ();
>>> +  if (thpmode != malloc_thp_mode_always)
>>> +    return 0;
>>> +
>>> +  unsigned long int thpsize = __malloc_default_thp_pagesize ();
>>> +  if ((uintptr_t) mem % thpsize != 0
>>> +      || size % thpsize != 0
>>> +      || (size - guardsize) % thpsize != 0)
>>> +    return 0;
>>>
>>> Isn't the last part always true currently, given that the guard page size is
>>> based on the standard page size?  IIRC the issue was that the mmap succeeds
>>> but the guard page is taken from the original mmap, which then causes the
>>> decomposition.
>>>
>>> So you'd need something like:
>>>
>>>       || guardsize % thpsize == 0)
>>>
>>> I.e. we return without the madvise if the size and alignment are wrong for a
>>> huge page, or they are correct and the guardsize is a multiple of a huge page
>>> (in which case it shouldn't decompose).
>>>
>>> +  return __madvise (mem, size, MADV_NOHUGEPAGE);
>>> +}
>>>
>>> Cheers,
>>> Wilco
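
For what it's worth, below is a rough standalone sketch (outside glibc) of the
check Wilco describes: skip the madvise when the mapping cannot be huge-page
backed anyway, or when the guard size is itself a multiple of the huge page
size, and only issue MADV_NOHUGEPAGE when the guard setup would otherwise
decompose a huge page.  The names THP_SIZE and maybe_advise_nohugepage are
made up for illustration, and the huge page size is hard-coded instead of
being read from /sys/kernel/mm/transparent_hugepage/hpage_pmd_size as glibc
does.

  #include <stdint.h>
  #include <stdio.h>
  #include <sys/mman.h>

  /* Assumed 2 MiB PMD huge page size; glibc reads the real value from
     /sys/kernel/mm/transparent_hugepage/hpage_pmd_size.  */
  #define THP_SIZE (2UL * 1024 * 1024)

  static int
  maybe_advise_nohugepage (void *mem, size_t size, size_t guardsize)
  {
    /* Not aligned or not sized for a huge page: THP cannot back this
       mapping anyway, so there is nothing to opt out of.  */
    if ((uintptr_t) mem % THP_SIZE != 0 || size % THP_SIZE != 0)
      return 0;

    /* Guard area is itself a multiple of the huge page size: the guard
       mprotect will fall on a huge page boundary and not split anything.  */
    if (guardsize % THP_SIZE == 0)
      return 0;

    /* Otherwise setting up the guard page would decompose the huge page,
       so advise the kernel not to use THP for this range at all.  */
    return madvise (mem, size, MADV_NOHUGEPAGE);
  }

  int
  main (void)
  {
    size_t size = 4 * THP_SIZE;
    void *mem = mmap (NULL, size, PROT_READ | PROT_WRITE,
                      MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (mem == MAP_FAILED)
      return 1;

    /* A 64 KiB guard is not a huge-page multiple, so the advice is applied
       (provided the kernel happened to place the mapping THP-aligned;
       otherwise the helper is a no-op, which is the point of the check).  */
    int rc = maybe_advise_nohugepage (mem, size, 64 * 1024);
    printf ("maybe_advise_nohugepage returned %d\n", rc);

    munmap (mem, size);
    return 0;
  }

On a real thread stack it is the guard mprotect that splits the huge page, so
the two early returns above are the cheap cases and the madvise only runs when
decomposition would actually happen.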