public inbox for libc-alpha@sourceware.org
From: Adhemerval Zanella Netto <adhemerval.zanella@linaro.org>
To: Frederico Silva Correa <fscorrea@inf.ufrgs.br>,
	libc-alpha@sourceware.org
Subject: Re: clone() and Glibc
Date: Fri, 31 Mar 2023 09:52:53 -0300
Message-ID: <067bc0d9-6329-7c85-0bba-8615b5b99154@linaro.org>
In-Reply-To: <9b0e446e29a20f2ad1903e5cf681bffa@inf.ufrgs.br>



On 30/03/23 20:35, Frederico Silva Correa via Libc-alpha wrote:
> Dear developers of the GNU libc:
> 
> As a novice who recently graduated in CS (though familiar with both C and C++), I found myself puzzled by a small issue.
> 
> Glibc provides a wrapper for the clone() system call, which receives parameters in the following order:
> 
> - a pointer to the function "func" to be run by the child thread;
> - a base address for the child stack (we'll come back to this); since I'm passing CLONE_VM, memory is shared, so the child cannot reuse the parent thread's stack addresses;
> - the flags, here 0x100 (CLONE_VM);
> - a pointer to the arguments to be passed to "func" and run with the child thread.
> 
> Automatic storage local variables are usually placed on the stack, whose location remains more or less fixed, decided when the application is run; is that correct?
> Very well. What, then, is a stack whose "base address" I myself malloc'd (on the HEAP) and then arbitrarily decided should be treated as the stack of the child thread, even though that space is, a priori, in the heap?
> 
> How am I supposed to interpret all of this? Does the space allocated on the heap need to be freed? Is this so-called "stack" on the heap? Or is it a regular stack frame? What about potential threats related to, e.g., ret2plt and format string attacks?
> 
> I'd be very pleased to have these questions clarified, both as a language enthusiast and as a user with security worries.
> 
> Thanks in advance.
> 

You can check how clone is used within glibc in the posix_spawn
implementation [1] and in pthread_create [2].  Your described use case seems
similar to pthread_create, which either explicitly allocates the thread
stack or gets it from the pthread_attr_t attribute.  The code is somewhat
complex [2] because it tries to maintain a cache of allocated stacks, but you
can check the 'allocate_stack' function [3], which holds the main logic.

In both posix_spawn and pthread_create, the code creates a fixed-size stack
(posix_spawn takes into consideration the passed args plus some slack, while
for pthread_create the size is configurable).  The stack is marked as
non-executable as per the ABI (if PT_GNU_STACK is set), and a guard page is
added to catch overflow (assuming -fstack-clash-protection).
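To make the guard-page idea concrete, here is a minimal sketch of a
fixed-size stack mapped with one inaccessible page below it.  This is not the
glibc code: the names child_stack, child_stack_alloc and child_stack_free are
hypothetical, and it assumes an architecture where the stack grows downward.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <stddef.h>
#include <sys/mman.h>
#include <unistd.h>

/* Hypothetical descriptor for a manually mapped stack. */
struct child_stack {
    void  *map;      /* start of the whole mapping (the guard page) */
    size_t map_len;  /* guard page plus usable bytes */
    void  *base;     /* lowest usable address */
    void  *top;      /* one past the highest usable address */
};

/* Map `usable` bytes (rounded up to whole pages) plus one guard page
   below them.  Returns 0 on success, -1 on failure. */
static int child_stack_alloc(struct child_stack *s, size_t usable)
{
    assert(s != NULL);
    size_t page = (size_t) sysconf(_SC_PAGESIZE);

    usable = (usable + page - 1) & ~(page - 1);
    s->map_len = usable + page;
    s->map = mmap(NULL, s->map_len, PROT_READ | PROT_WRITE,
                  MAP_PRIVATE | MAP_ANONYMOUS | MAP_STACK, -1, 0);
    if (s->map == MAP_FAILED)
        return -1;

    /* The lowest page becomes inaccessible, so running off the end of
       the stack faults instead of silently corrupting other memory. */
    if (mprotect(s->map, page, PROT_NONE) != 0) {
        munmap(s->map, s->map_len);
        return -1;
    }

    s->base = (char *) s->map + page;
    s->top  = (char *) s->map + s->map_len;
    return 0;
}

/* Release the mapping; only valid once nothing runs on the stack. */
static void child_stack_free(struct child_stack *s)
{
    munmap(s->map, s->map_len);
}
```

The mapping is requested without PROT_EXEC, so the stack is non-executable
regardless of PT_GNU_STACK.  The guard page only catches accesses that
actually land on it, which is why the probing done by
-fstack-clash-protection matters for large frames.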

For pthread_create, a user-allocated stack can be freed after the thread
terminates (either by calling pthread_exit, by returning from the thread
function, or via pthread_cancel), and it is up to the caller to deallocate
the thread stack.  This gets murky with detached threads, where it is UB to
call pthread_join to check whether the thread is still active (glibc returns
EINVAL in this case, and with a user-allocated stack it means that it won't
be reused).
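As an illustration of the joinable case, the following sketch hands a
caller-allocated stack to pthread_create through pthread_attr_setstack and
frees it only after pthread_join has returned.  The names worker,
run_on_user_stack and USER_STACK_SIZE are hypothetical, and the 256 KiB size
is an assumption chosen to be comfortably above PTHREAD_STACK_MIN.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>
#include <stdlib.h>
#include <unistd.h>

#define USER_STACK_SIZE (256 * 1024)  /* assumed size, for illustration */

/* Thread function: increments the int it is given. */
static void *worker(void *arg)
{
    int *value = arg;
    *value += 1;
    return NULL;
}

/* Run worker() on a caller-allocated stack.  Returns 0 on success. */
static int run_on_user_stack(int *value)
{
    void *stack = NULL;
    pthread_attr_t attr;
    pthread_t tid;
    int rc;

    assert(value != NULL);
    if (posix_memalign(&stack, (size_t) sysconf(_SC_PAGESIZE),
                       USER_STACK_SIZE) != 0)
        return -1;

    rc = pthread_attr_init(&attr);
    if (rc != 0) {
        free(stack);
        return rc;
    }

    /* pthread_attr_setstack takes the LOWEST address of the region. */
    rc = pthread_attr_setstack(&attr, stack, USER_STACK_SIZE);
    if (rc == 0)
        rc = pthread_create(&tid, &attr, worker, value);
    pthread_attr_destroy(&attr);
    if (rc != 0) {
        free(stack);  /* no thread was started, so the stack is unused */
        return rc;
    }

    /* Only after a successful join is the thread known to be off the
       stack; if the join fails, leak the stack rather than free it. */
    rc = pthread_join(tid, NULL);
    if (rc == 0)
        free(stack);
    return rc;
}
```

With a detached thread there is no equivalent of the join step, so the
caller would need some other, application-defined signal before freeing the
memory.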

I am using the pthread code as an example to show that clone with CLONE_VM
is *really* tricky and not really meant to be used in generic code that aims
to work along with the C runtime.  If you actually use a separate stack with
clone, you will need a way to synchronize on the end of the thread's
execution: in pthread_create we use CLONE_CHILD_CLEARTID, while posix_spawn
is simpler because it uses CLONE_VFORK (the calling thread is suspended until
the child calls execve or exits).
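For the simpler CLONE_VFORK case, a minimal sketch of calling the glibc clone
wrapper directly could look like the code below.  It is not what posix_spawn
does internally; child_fn, run_clone_child and CHILD_STACK_SIZE are
hypothetical names, the 64 KiB size is an assumption, and the code assumes a
downward-growing stack.  The child deliberately avoids libc calls, since it
shares the address space with the parent without any C runtime setup of its
own.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <sched.h>
#include <signal.h>
#include <stddef.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <sys/wait.h>

#define CHILD_STACK_SIZE (64 * 1024)  /* assumed size, for illustration */

/* Runs in the child.  With CLONE_VM the store below is visible to the
   parent; the return value becomes the child's exit status. */
static int child_fn(void *arg)
{
    *(volatile int *) arg = 42;
    return 7;
}

/* Clone a child on a freshly mapped stack, wait for it, and report its
   exit status through *exit_code.  Returns 0 on success, -1 on error. */
static int run_clone_child(int *shared, int *exit_code)
{
    assert(shared != NULL && exit_code != NULL);

    char *stack = mmap(NULL, CHILD_STACK_SIZE, PROT_READ | PROT_WRITE,
                       MAP_PRIVATE | MAP_ANONYMOUS | MAP_STACK, -1, 0);
    if (stack == MAP_FAILED)
        return -1;

    /* The stack grows down, so clone gets the highest address.
       CLONE_VFORK suspends the caller until the child exits (or execs),
       which is the synchronization point that makes reuse safe. */
    pid_t pid = clone(child_fn, stack + CHILD_STACK_SIZE,
                      CLONE_VM | CLONE_VFORK | SIGCHLD, shared);

    int status = 0;
    int rc = -1;
    if (pid > 0 && waitpid(pid, &status, 0) == pid && WIFEXITED(status)) {
        *exit_code = WEXITSTATUS(status);
        rc = 0;
    }

    /* clone has returned, so the child no longer runs on this stack. */
    munmap(stack, CHILD_STACK_SIZE);
    return rc;
}
```

Without CLONE_VFORK the parent would continue immediately, and something
like CLONE_CHILD_CLEARTID plus a futex wait would be needed before the stack
could be released.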

[1] https://sourceware.org/git/?p=glibc.git;a=blob;f=sysdeps/unix/sysv/linux/spawni.c;h=bc321d4c5879fba178ae4fb3f6e18eeb10ad0a72;hb=HEAD#l339
[2] https://sourceware.org/git/?p=glibc.git;a=blob;f=nptl/pthread_create.c;h=a3619da1e216190bb4679936e105d418f683222a;hb=HEAD#l297
[3] https://sourceware.org/git/?p=glibc.git;a=blob;f=nptl/allocatestack.c;h=c7adbccd6fc9ae99e6777034443c53a0224c6b1c;hb=HEAD
