public inbox for libc-help@sourceware.org
* glibc interception problems
@ 2021-05-25  2:30 肖鹏
  2021-05-25  3:10 ` Mike Frysinger
  0 siblings, 1 reply; 3+ messages in thread
From: 肖鹏 @ 2021-05-25  2:30 UTC
  To: libc-help

Hi,


I'm developing a library for a distributed file system, using LD_PRELOAD to intercept file-related calls to glibc.
Programs that use unbuffered IO directly (close/open/read/write) work well with my library. But programs that use buffered IO (fclose/fopen/fread/fwrite) end up calling glibc's unbuffered functions instead of the ones in my library.
The reason is that glibc's buffered IO depends on internal symbols, which cannot be intercepted.
I can see two ways to work around this:
1. Implement buffered IO in my library.
2. Modify glibc buffered IO to use public symbols, and maintain a custom version of glibc.
Both ways have drawbacks. Do you have any suggestions about this?
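
For readers unfamiliar with the setup, here is a minimal sketch of the kind of LD_PRELOAD interposer described above. It is an illustration under stated assumptions, not the actual library: the counter `intercepted_writes` is hypothetical and exists only to make the effect observable, and the sketch forwards with a raw `syscall(SYS_write, ...)` to stay short (looking up the original with `dlsym(RTLD_NEXT, "write")` is the other common way to forward).

```c
#define _GNU_SOURCE
#include <sys/syscall.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical counter, present only so the interception is observable. */
long intercepted_writes = 0;

/*
 * Same name and signature as libc's write(). When this object is preloaded,
 * the dynamic linker binds the application's references to write() here
 * instead of to glibc's definition.
 */
ssize_t write(int fd, const void *buf, size_t count)
{
    intercepted_writes++;
    /* A real library would redirect file descriptors it owns to the
     * distributed file system here; this sketch just forwards to the kernel. */
    return syscall(SYS_write, fd, buf, count);
}
```

Built as a shared object (for example `gcc -shared -fPIC -o libhook.so hook.c`, file names being placeholders) and loaded with `LD_PRELOAD=./libhook.so`, this catches an application's direct write() calls. The write issued inside glibc when fwrite() flushes its buffer does not go through this symbol, which is the problem described above.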


Thanks very much! I look forward to your response.


* Re: glibc interception problems
  2021-05-25  2:30 glibc interception problems 肖鹏
@ 2021-05-25  3:10 ` Mike Frysinger
       [not found]   ` <4f73c540.6a6.179a6344524.Coremail.xiaopeng_phy@163.com>
  0 siblings, 1 reply; 3+ messages in thread
From: Mike Frysinger @ 2021-05-25  3:10 UTC
  To: 肖鹏; +Cc: libc-help

On 25 May 2021 10:30, 肖鹏 via Libc-help wrote:
> I'm developing a library for a distributed file system, using LD_PRELOAD to intercept file-related calls to glibc.
> Programs that use unbuffered IO directly (close/open/read/write) work well with my library. But programs that use buffered IO (fclose/fopen/fread/fwrite) end up calling glibc's unbuffered functions instead of the ones in my library.
> The reason is that glibc's buffered IO depends on internal symbols, which cannot be intercepted.
> I can see two ways to work around this:
> 1. Implement buffered IO in my library.
> 2. Modify glibc buffered IO to use public symbols, and maintain a custom version of glibc.
> Both ways have drawbacks. Do you have any suggestions about this?

what is your end goal ?

is this a research project ?
are you developing an open source project that you want to share with others ?
something else ?

if it's a short-term project (e.g. for research), (2) might be your best bet.
it's hacky, but if it gathers the info you need, it's not that bad.

if it's a long-term project you want to share with others, i'd recommend not
using LD_PRELOAD at all.  maybe FUSE would be better.
-mike


* Re: Re: glibc interception problems
       [not found]   ` <4f73c540.6a6.179a6344524.Coremail.xiaopeng_phy@163.com>
@ 2021-05-26  3:47     ` Mike Frysinger
  0 siblings, 0 replies; 3+ messages in thread
From: Mike Frysinger @ 2021-05-26  3:47 UTC
  To: 肖鹏; +Cc: libc-help

On 26 May 2021 09:05, 肖鹏 wrote:
> Thank you for your prompt response.
> I'm developing an open source distributed file system (https://github.com/chubaofs/chubaofs). Yes, we are currently using FUSE.
> But FUSE is slow because of kernel context switches and data copies. We are developing a kernel-bypass implementation, which increases QPS by 70%-100% in our tests.
> The use of LD_PRELOAD has been carefully assessed and is limited to our own servers.
> Maintaining a custom version of glibc makes the server environment more complicated and less maintainable. I worry that other applications would be affected, because glibc is so fundamental.
> So I'm wondering if there are other ways.

by design, there is no way to intercept the symbols glibc calls internally.
binding them internally speeds glibc up, and those calls are not part of
glibc's ABI guarantees.

even if you could, it wouldn't help with static programs, or with programs
that make the syscall directly (while not common, it's not unheard of).
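
as a rough illustration of the direct-syscall case, a program can do something like the sketch below. `raw_write` is a hypothetical helper name; the point is only that the code never references the `write` symbol, so a preloaded `write` is never consulted.

```c
#define _GNU_SOURCE
#include <sys/syscall.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper: issue write(2) through the generic syscall() entry
 * point. No reference to the libc symbol write() is made, so an interposed
 * write() from an LD_PRELOAD library is bypassed.
 */
static ssize_t raw_write(int fd, const void *buf, size_t count)
{
    return syscall(SYS_write, fd, buf, count);
}
```

a statically linked program is in the same position for a different reason: there is no dynamic symbol lookup left for LD_PRELOAD to influence.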

if you really need to capture specific syscalls, your choices are basically:
* FUSE
* ptrace
* in-kernel implementation of the FS

you might be able to optimize the ptrace implementation via seccomp filters.
libseccomp can help create those filters.
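
a minimal sketch of the ptrace option, assuming Linux. the helper names `count_syscall_stops` and `sample_workload` are made up for illustration. it only counts syscall stops in a traced child; a real tracer would decode the registers at each stop (which is architecture-specific) and would likely use a seccomp filter so that only the syscalls of interest cause a stop.

```c
#define _GNU_SOURCE
#include <signal.h>
#include <sys/ptrace.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical workload: one explicit syscall for the tracer to observe. */
static void sample_workload(void)
{
    syscall(SYS_getpid);
}

/*
 * Hypothetical helper: run fn() in a traced child and return the number of
 * syscall stops (entry and exit are reported separately), or -1 on error.
 */
static long count_syscall_stops(void (*fn)(void))
{
    pid_t pid = fork();
    if (pid < 0)
        return -1;
    if (pid == 0) {
        ptrace(PTRACE_TRACEME, 0, NULL, NULL);
        raise(SIGSTOP);              /* wait for the parent to attach options */
        fn();
        _exit(0);
    }

    int status;
    if (waitpid(pid, &status, 0) < 0 || !WIFSTOPPED(status))
        return -1;
    /* Mark syscall stops with SIGTRAP|0x80 so they differ from real signals. */
    ptrace(PTRACE_SETOPTIONS, pid, NULL, (void *)(long)PTRACE_O_TRACESYSGOOD);

    long stops = 0;
    int sig = 0;                     /* the initial SIGSTOP is not forwarded */
    for (;;) {
        if (ptrace(PTRACE_SYSCALL, pid, NULL, (void *)(long)sig) < 0)
            return -1;
        if (waitpid(pid, &status, 0) < 0)
            return -1;
        if (WIFEXITED(status) || WIFSIGNALED(status))
            break;
        sig = 0;
        if (WIFSTOPPED(status)) {
            if (WSTOPSIG(status) == (SIGTRAP | 0x80))
                stops++;             /* syscall entry or exit */
            else
                sig = WSTOPSIG(status);  /* pass genuine signals through */
        }
    }
    return stops;
}
```

every stop costs a round trip between tracee and tracer, which is why narrowing the set of traced syscalls with a filter matters for performance.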
-mike

