public inbox for libc-help@sourceware.org
* Stacksize for nss modules called initiated by getaddrinfo_a
@ 2020-07-08 20:59 Hans van den Bogert
From: Hans van den Bogert @ 2020-07-08 20:59 UTC (permalink / raw)
  To: libc-help

Hi list,

- Straight to the point question:

Why can't I allocate a buffer of more than ~80kB on the stack in an NSS
module when the lookup is initiated by getaddrinfo_a, whereas the
blocking variant, getaddrinfo, has no problem with that? Shouldn't a
couple of hundred kB on the stack be peanuts?

- Environment:
At least reproducible with libc 2.27 (ubuntu 18.04)
Also reproduced with ubuntu 16.04 and 20.04


- How I came to the above question:

I am using the nss-docker library. In the past I had to edit this
library and increase a buffer to 100kB to hold a JSON response from a
docker daemon. The buffer lives on the stack, which is perhaps not the
best choice; only in hindsight did I find out that 100kB is more of a
candidate for the heap.
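
For illustration, here is a minimal sketch of moving such a buffer from
the stack to the heap. It is not the nss-docker code: the names
`read_response` and `lookup_with_heap_buffer` and the canned JSON string
are hypothetical stand-ins, and only the 102400-byte size comes from the
message.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Same size as the char[102400] buffer mentioned in the message. */
#define RESPONSE_BUF_SIZE 102400

/* Hypothetical stand-in for reading the docker daemon's JSON reply.
 * Copies a canned string into buf and returns its length. */
static size_t read_response(char *buf, size_t cap)
{
    const char *fake = "{\"NetworkSettings\":{}}";
    size_t n = strlen(fake);

    assert(cap > 0);
    if (n >= cap)
        n = cap - 1;            /* truncate instead of overflowing */
    memcpy(buf, fake, n);
    buf[n] = '\0';
    return n;
}

/* Hypothetical lookup helper: the large buffer is taken from the heap,
 * so the function's stack frame stays small regardless of the size of
 * the calling thread's stack.
 * Returns the response length, or -1 if the allocation fails. */
long lookup_with_heap_buffer(void)
{
    char *buf = malloc(RESPONSE_BUF_SIZE);
    size_t n;

    if (buf == NULL)
        return -1;

    n = read_response(buf, RESPONSE_BUF_SIZE);
    /* ... the JSON would be parsed here ... */

    free(buf);
    return (long)n;
}
```

The trade-off is an extra allocation per lookup and an error path for
a failed malloc, in exchange for not depending on how much stack the
caller happens to have.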

However, I upstreamed the patch and all was well, until I started using
Emacs 26, which introduced async connections (including async DNS
resolution). Emacs would fail with a SIGSEGV as soon as a plugin used
these new async connection functions.

Having had zero problems with nss-docker for months, I blamed Emacs.
However, after some investigation of my own, I saw that the SIGSEGV
happened in nss-docker at `_nss_docker_getaddrbyname3_r`.
It has been a while since I debugged C, so a SIGSEGV at function entry
was a challenge, and at that point I had completely forgotten that I
ever changed the buffer size in that function. gdb also reported the
error on the line of the function declaration, not on the
`char[102400]` line (naive me).
Only after revisiting university material on calling conventions did I
realize that the compiler reserves the buffer on the stack at function
entry, and only then (at runtime) are the getaddrbyname3_r(...)
parameters copied from registers to the stack. That copy was the point
where the SIGSEGV happened. Lowering the buffer size to 50k resulted in
a perfectly working library.
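
The numbers above can be sketched in code. This assumes, without
confirmation from this message, that the async lookup runs on a thread
with a much smaller stack than the main thread. The 80 KiB figure
mirrors the limit observed here and is not a value taken from glibc;
`frame_fits`, `touch_frame` and `run_on_stack` are hypothetical helper
names. The check is optimistic, since guard pages and thread-local data
also consume part of a thread's stack.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Assumption: illustrative sizes based on the message, not on glibc. */
#define SMALL_STACK_BYTES (80 * 1024)  /* roughly the observed limit */
#define BIG_BUFFER_BYTES  102400       /* the char[102400] buffer */
#define SAFE_FRAME_BYTES  (32 * 1024)  /* a frame that fits comfortably */

/* Optimistic check: can a frame of frame_bytes fit in a stack of
 * stack_bytes? Real stacks lose some space to guard pages and TLS. */
int frame_fits(size_t stack_bytes, size_t frame_bytes)
{
    return frame_bytes < stack_bytes;
}

/* Thread body: the whole buffer is reserved when the function is
 * entered, which is why an oversized frame faults at function entry
 * rather than on the line that declares the array. */
static void *touch_frame(void *result)
{
    volatile char buf[SAFE_FRAME_BYTES];

    buf[0] = 1;
    buf[SAFE_FRAME_BYTES - 1] = 1;
    *(int *)result = buf[0] + buf[SAFE_FRAME_BYTES - 1];
    return NULL;
}

/* Runs touch_frame on a thread whose stack is stack_bytes large.
 * Returns 0 on success, a pthread error code, or -1 if the thread
 * did not write the expected result. */
int run_on_stack(size_t stack_bytes)
{
    pthread_attr_t attr;
    pthread_t tid;
    int touched = 0;
    int rc;

    assert(frame_fits(stack_bytes, SAFE_FRAME_BYTES));

    rc = pthread_attr_init(&attr);
    if (rc != 0)
        return rc;
    rc = pthread_attr_setstacksize(&attr, stack_bytes);
    if (rc == 0)
        rc = pthread_create(&tid, &attr, touch_frame, &touched);
    pthread_attr_destroy(&attr);
    if (rc != 0)
        return rc;

    rc = pthread_join(tid, NULL);
    if (rc != 0)
        return rc;
    return touched == 2 ? 0 : -1;
}
```

With these sizes, `frame_fits(SMALL_STACK_BYTES, BIG_BUFFER_BYTES)` is
false while a 50 KiB frame passes the check, which matches the
behaviour described above. Note that `pthread_attr_setstacksize`
rejects sizes below the platform minimum, so `run_on_stack` can return
an error for small values on some architectures.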

Only then did I realize that I had hit the mystical type of fault after
which a popular programmer website is named.

I hope my analysis and conclusions are valid, and I hope that someone 
can explain this.

Thanks,

Hans




