From: Adhemerval Zanella <adhemerval.zanella@linaro.org>
To: Dennis Filder <d.filder@web.de>, Libc-help <libc-help@sourceware.org>
Subject: Re: LD_PRELOAD wrappers for system calls and stdio
Date: Thu, 9 Sep 2021 07:51:50 -0300
Message-ID: <9fb7adcd-c840-6873-0e27-23993a7c9f0d@linaro.org>
In-Reply-To: <YTPCXU8e3wCBph7z@reader>



On 04/09/2021 16:00, Dennis Filder via Libc-help wrote:
> Hi,
> 
> I'm trying to write an LD_PRELOAD hack for the purpose of
> tracing/logging.  The goal was to get a time-stamped copy of
> everything that is ever read/written over a selected set of file
> descriptors and log it to another file descriptor.  A second goal was
> achieving a high degree of portability.
> 
> My naive hope was that I could implement this by wrapping just a
> handful of system call wrappers and be done with it.  Imagine my
> surprise when in the process of coding that up (under Linux) I came to
> notice that not all stdio functions use write() internally to actually
> write data to a file descriptor.  Some, e.g. fwrite(), do something
> esoteric which involves book-keeping with a FILE object and also an
> apparently in-lined invocation of syscall(SYS_write, ...).  If my
> understanding is correct then this means that it is literally
> impossible to attain my goal by wrapping only the system call wrappers
> which would leave me (and thus basically everyone in a similar
> position) with these options:

Yes, glibc's internal calls to functions like read() and write() are
*not* made through the PLT.  This means that symbol interposition
does not work for such cases.
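To make the limitation concrete, here is a minimal sketch of the usual
interposer, assuming glibc and dlsym(RTLD_NEXT, ...).  The counter and
its accessor are hypothetical names added only so the effect can be
observed.  It sees write() calls that the application makes itself,
but not the ones glibc issues internally on behalf of fwrite() and
friends.

```c
#define _GNU_SOURCE          /* needed for RTLD_NEXT on glibc */
#include <dlfcn.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical counter; a real tracer would log a timestamp and the data. */
static unsigned long intercepted_writes;

typedef ssize_t (*write_fn)(int, const void *, size_t);

/* Interposed write(): resolved instead of libc's for calls that go
   through the PLT, i.e. calls made from outside libc itself. */
ssize_t write(int fd, const void *buf, size_t count)
{
    static write_fn real_write;

    if (real_write == NULL)
        real_write = (write_fn)dlsym(RTLD_NEXT, "write");
    if (real_write == NULL)
        return -1;           /* next definition not found */

    intercepted_writes++;
    return real_write(fd, buf, count);
}

/* Hypothetical accessor for the counter above. */
unsigned long intercepted_write_count(void)
{
    return intercepted_writes;
}
```

Built as a shared object (for example with cc -shared -fPIC, adding
-ldl on older glibc) and loaded through LD_PRELOAD, this counts direct
write() calls only.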

> 
>  a) also wrap essentially /every/ stdio function that is not
>     guaranteed to only call already wrapped functions,
> 
>  b) use the Linux Auditing System instead, or
> 
>  c) use ptrace() instead and reimplement 3/4 of ltrace/strace.
> 
> Neither prospect has me rejoicing as they involve either a ton of work
> and/or sacrificing portability.
> 
> What am I supposed to do?

For Linux you have seccomp filters [1], and with kernel 3.5+ you can
optimize it a bit by selecting only the syscalls you are interested in.
Mike Frysinger discussed some options in a previous thread [2].
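As a rough illustration of selecting only some syscalls, here is a
sketch of a seccomp-BPF filter that flags read and write for a tracer
and lets everything else through.  The names trace_filter and
install_trace_filter are hypothetical, and the choice of
SECCOMP_RET_TRACE (which needs a ptrace tracer with
PTRACE_O_TRACESECCOMP set) is an assumption about how the logging
would be done.  The architecture check a real filter should perform
is omitted for brevity.

```c
#include <stddef.h>
#include <linux/filter.h>
#include <linux/seccomp.h>
#include <sys/prctl.h>
#include <sys/syscall.h>

/* Hypothetical filter: notify the tracer on read/write, allow the rest.
   Simplification: seccomp_data.arch is not checked here. */
static struct sock_filter trace_filter[] = {
    /* [0] load the syscall number */
    BPF_STMT(BPF_LD | BPF_W | BPF_ABS, offsetof(struct seccomp_data, nr)),
    /* [1] read?  yes: jump to [3], no: continue at [2] */
    BPF_JUMP(BPF_JMP | BPF_JEQ | BPF_K, SYS_read, 1, 0),
    /* [2] write? yes: continue at [3], no: jump to [4] */
    BPF_JUMP(BPF_JMP | BPF_JEQ | BPF_K, SYS_write, 0, 1),
    /* [3] stop and report to the ptrace tracer */
    BPF_STMT(BPF_RET | BPF_K, SECCOMP_RET_TRACE),
    /* [4] every other syscall runs at full speed */
    BPF_STMT(BPF_RET | BPF_K, SECCOMP_RET_ALLOW),
};

/* Install the filter on the calling thread; returns 0 on success.
   Without an attached tracer the flagged syscalls fail with ENOSYS,
   so this is meant to be called in the traced child only. */
int install_trace_filter(void)
{
    struct sock_fprog prog = {
        .len = (unsigned short)(sizeof trace_filter / sizeof trace_filter[0]),
        .filter = trace_filter,
    };

    if (prctl(PR_SET_NO_NEW_PRIVS, 1, 0, 0, 0) != 0)
        return -1;
    return prctl(PR_SET_SECCOMP, SECCOMP_MODE_FILTER, &prog);
}
```

The tracer then only gets stops for the two listed syscalls instead of
for every syscall the process makes.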

> 
> I'm currently examining what it would take for option a), but I'm
> running into a steady stream of roadblocks.  A major one I'm stuck at
> are the variadic functions (printf and friends).  One way out seems to
> be to use GCC's __builtin_apply and calculate its size argument using
> a function that would have to be similar to glibc's
> parse_printf_format, but which would only return the number of bytes
> the arguments occupy on the stack (Would it be too much to ask to
> provide such a function as part of glibc?).  But I don't know if
> register-involving calling conventions will harmonize with that
> approach.  Will they?  Also what makes me reluctant to explore this
> further is the fear that I will eventually have to implement not just
> wrappers, but full-on replacements.  And I'd probably have to do the
> same for libstdc++, too.

Depending on what you intend to catch, you will indeed need *a lot* of
boilerplate for this approach.  I haven't explored a way to
interpose variadic functions, but afaik you can't really do it in a
*portable* way (you will need to resort to either a compiler or an
ABI extension).

> 
> Thanks in advance for any help/clarification.
> 
> P.S.: Solutions that involve installing a specially built version of
> glibc (e.g. with INLINE_SYSCALL undefined) are less than ideal because
> this project is not for personal, but public use, and having a
> custom-built libc as a dependency is thus effectively a showstopper.
> But maybe it would be possible to transplant a subset of routines from
> such a libc into my library.  But how would I even do that?  Close
> study of build logs tells me one of stdio-common/stamp.o and
> libc_pic.{a,os,os.clean} probably contains what I want, but I'm not
> sure what will break if I just copy that over.
> 

This was suggested some time ago, and it is the idea behind libOS [3].
That thread has some discussion of the pros and cons of this
approach.  You can also check how it was done there.


[1] https://sourceware.org/pipermail/libc-help/2021-August/006002.html
[2] https://sourceware.org/pipermail/libc-help/2021-August/006002.html
[3] https://sourceware.org/legacy-ml/libc-alpha/2019-09/msg00188.html
