public inbox for libc-help@sourceware.org
From: Mike Frysinger <vapier@gentoo.org>
To: 肖鹏 <xiaopeng_phy@163.com>
Cc: libc-help@sourceware.org
Subject: Re: Re: glibc interception problems
Date: Tue, 25 May 2021 23:47:12 -0400
Message-ID: <YK3EwLVYD2ygeQG4@vapier>
In-Reply-To: <4f73c540.6a6.179a6344524.Coremail.xiaopeng_phy@163.com>

On 26 May 2021 09:05, 肖鹏 wrote:
> Thank you for your immediate response.
> I'm developing an open source distributed file system (https://github.com/chubaofs/chubaofs). Yes, we are currently using FUSE.
> But FUSE is slow because of kernel context switches and data copies. We are developing a kernel-bypass implementation, which increases QPS by 70%-100% in our tests.
> The use of LD_PRELOAD has been carefully assessed, and is limited to our own servers.
> Maintaining a custom version of glibc makes the server environment complicated and less maintainable. I worry that other applications would be affected, because glibc is too fundamental.
> So, I'm wondering if there are other ways.

by design, there is no way to intercept the symbols glibc calls internally.
resolving those calls internally speeds glibc up, and being able to interpose
them is not part of glibc's ABI guarantees.

even if you could, it wouldn't help with static programs, or with programs
that make the syscall directly (while not common, it's not unheard of).
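
to illustrate the direct-syscall case, here is a minimal sketch assuming Linux
and glibc. the function names write_via_libc and write_via_syscall are
hypothetical and only serve the illustration. the first goes through the public
write() symbol, which a preloaded library can replace; the second enters the
kernel through syscall(2), so no write() symbol is ever looked up.

```c
#define _GNU_SOURCE
#include <string.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Goes through the public write() symbol. A library loaded with
 * LD_PRELOAD that defines write() will see this call. */
ssize_t write_via_libc(int fd, const char *msg)
{
    return write(fd, msg, strlen(msg));
}

/* Enters the kernel through syscall(2) with the raw syscall number.
 * No write() symbol is resolved, so a preloaded write() never runs. */
ssize_t write_via_syscall(int fd, const char *msg)
{
    return syscall(SYS_write, fd, msg, strlen(msg));
}
```

both functions produce the same bytes on the file descriptor; only the first
is visible to an LD_PRELOAD library.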

if you really need to capture specific syscalls, your choices are basically:
* FUSE
* ptrace
* in-kernel implementation of the FS
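
as a rough sketch of the ptrace option, assuming Linux and glibc's
<sys/ptrace.h>: the tracer below runs a workload in a traced child and counts
its syscall stops. count_syscall_stops and demo_workload are hypothetical names
made up for this example. it only counts stops; a real interceptor would also
read the registers at each stop (which is architecture specific) and decide
whether to rewrite or emulate the call.

```c
#define _GNU_SOURCE
#include <signal.h>
#include <sys/ptrace.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical workload: one harmless syscall that fails with EBADF. */
void demo_workload(void)
{
    close(-1);
}

/* Runs child_fn in a traced child. Returns the number of syscall stops
 * seen (entry and exit each count once), or -1 if tracing could not be
 * set up. */
long count_syscall_stops(void (*child_fn)(void))
{
    int status;
    pid_t pid = fork();
    if (pid < 0)
        return -1;

    if (pid == 0) {
        if (ptrace(PTRACE_TRACEME, 0, NULL, NULL) != 0)
            _exit(127);
        raise(SIGSTOP);              /* stop so the tracer can attach options */
        child_fn();
        _exit(0);
    }

    /* Wait for the initial SIGSTOP; anything else means tracing failed. */
    if (waitpid(pid, &status, 0) < 0 || !WIFSTOPPED(status))
        return -1;

    /* Report syscall stops as SIGTRAP|0x80 so they differ from real SIGTRAPs. */
    if (ptrace(PTRACE_SETOPTIONS, pid, NULL,
               (void *)(long)PTRACE_O_TRACESYSGOOD) != 0) {
        kill(pid, SIGKILL);
        waitpid(pid, &status, 0);
        return -1;
    }

    long stops = 0;
    int sig = 0;                     /* signal to inject on resume, 0 = none */
    for (;;) {
        /* Resume until the next syscall entry or exit. */
        if (ptrace(PTRACE_SYSCALL, pid, NULL, (void *)(long)sig) != 0) {
            kill(pid, SIGKILL);
            waitpid(pid, &status, 0);
            return -1;
        }
        if (waitpid(pid, &status, 0) < 0)
            return -1;
        if (WIFEXITED(status) || WIFSIGNALED(status))
            break;
        if (WIFSTOPPED(status) && WSTOPSIG(status) == (SIGTRAP | 0x80)) {
            stops++;
            sig = 0;
        } else {
            sig = WSTOPSIG(status);  /* forward unrelated signals to the child */
        }
    }
    return stops;
}
```

every syscall costs two stops and several context switches with this approach,
which is why narrowing the set of traced syscalls matters.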

you might be able to optimize the ptrace implementation via seccomp filters.
libseccomp can help create those filters.
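
a hedged sketch of such a filter follows. to stay free of extra dependencies it
is written as raw BPF using the kernel headers rather than libseccomp, which
would generate an equivalent program for you. it asks the kernel to notify a
tracer only for openat and to let every other syscall run untraced. the names
trace_openat_insns and install_trace_openat_filter are hypothetical, and the
architecture check a production filter needs is left out for brevity.

```c
#include <linux/filter.h>
#include <linux/seccomp.h>
#include <stddef.h>
#include <sys/prctl.h>
#include <sys/syscall.h>

/* Classic BPF program evaluated by the kernel on every syscall.
 * NOTE: a real filter should first check seccomp_data.arch. */
struct sock_filter trace_openat_insns[] = {
    /* A = seccomp_data.nr (the syscall number) */
    BPF_STMT(BPF_LD | BPF_W | BPF_ABS, offsetof(struct seccomp_data, nr)),
    /* if A == SYS_openat, fall through to TRACE; otherwise skip to ALLOW */
    BPF_JUMP(BPF_JMP | BPF_JEQ | BPF_K, SYS_openat, 0, 1),
    BPF_STMT(BPF_RET | BPF_K, SECCOMP_RET_TRACE),
    BPF_STMT(BPF_RET | BPF_K, SECCOMP_RET_ALLOW),
};

/* Installs the filter on the calling thread. Returns 0 on success and
 * -1 on failure. The filter is inherited across fork and execve. */
int install_trace_openat_filter(void)
{
    struct sock_fprog prog = {
        .len = sizeof(trace_openat_insns) / sizeof(trace_openat_insns[0]),
        .filter = trace_openat_insns,
    };

    /* Required for unprivileged processes before loading a filter. */
    if (prctl(PR_SET_NO_NEW_PRIVS, 1, 0, 0, 0) != 0)
        return -1;
    return prctl(PR_SET_SECCOMP, SECCOMP_MODE_FILTER, &prog);
}
```

a tracer that sets PTRACE_O_TRACESECCOMP and resumes the child with PTRACE_CONT
instead of PTRACE_SYSCALL then stops only on the filtered syscalls. without a
tracer attached, a syscall that hits SECCOMP_RET_TRACE fails with ENOSYS.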
-mike

> At 2021-05-25 11:10:35, "Mike Frysinger" <vapier@gentoo.org> wrote:
> >On 25 May 2021 10:30, 肖鹏 via Libc-help wrote:
> >> I'm developing a library for a distributed file system, using LD_PRELOAD to intercept file-related calls to glibc.
> >> Programs directly using unbuffered IO (close/open/read/write) work well with my library. But programs using buffered IO (fclose/fopen/fread/fwrite) will call the unbuffered functions of glibc instead of my library.
> >> The reason is that in glibc, buffered IO depends on internal symbols, which cannot be intercepted.
> >> I can see two ways to work around this:
> >> 1. Implement buffered IO in my library.
> >> 2. Modify glibc buffered IO to use public symbols, and maintain a custom version of glibc.
> >> Both ways have drawbacks. Do you have any suggestions about this?
> >
> >what is your end goal ?
> >
> >is this a research project ?
> >are you developing an open source project that you want to share with others ?
> >something else ?
> >
> >if it's a short-term project (e.g. for research), (2) might be your best bet.
> >it's hacky, but if it gathers the info you need, it's not that bad.
> >
> >if it's a long-term project you want to share with others, i'd recommend not
> >using LD_PRELOAD at all.  maybe FUSE would be better.
> >-mike


Thread overview: 3+ messages
2021-05-25  2:30 肖鹏
2021-05-25  3:10 ` Mike Frysinger
     [not found]   ` <4f73c540.6a6.179a6344524.Coremail.xiaopeng_phy@163.com>
2021-05-26  3:47     ` Mike Frysinger [this message]
