From: Adhemerval Zanella
Subject: Re: LD_PRELOAD wrappers for system calls and stdio
To: Dennis Filder, Libc-help
Date: Thu, 9 Sep 2021 07:51:50 -0300
Message-ID: <9fb7adcd-c840-6873-0e27-23993a7c9f0d@linaro.org>

On 04/09/2021 16:00, Dennis Filder via Libc-help wrote:
> Hi,
>
> I'm trying to write an LD_PRELOAD hack for the purpose of
> tracing/logging. The goal was to get a time-stamped copy of
> everything that is ever read/written over a selected set of file
> descriptors and log it to another file descriptor. A second goal was
> achieving a high degree of portability.
>
> My naive hope was that I could implement this by wrapping just a
> handful of system call wrappers and be done with it. Imagine my
> surprise when in the process of coding that up (under Linux) I came to
> notice that not all stdio functions use write() internally to actually
> write data to a file descriptor. Some, e.g.
> fwrite(), do something
> esoteric which involves book-keeping with a FILE object and also an
> apparently in-lined invocation of syscall(SYS_write, ...). If my
> understanding is correct then this means that it is literally
> impossible to attain my goal by wrapping only the system call wrappers
> which would leave me (and thus basically everyone in a similar
> position) with these options:

Yes, glibc's internal calls to functions like read() and write() are *not*
made through the PLT. This means that symbol interposition does not work
in such cases.

>
> a) also wrap essentially /every/ stdio function that is not
> guaranteed to only call already wrapped functions,
>
> b) use the Linux Auditing System instead, or
>
> c) use ptrace() instead and reimplement 3/4 of ltrace/strace.
>
> Neither prospect has me rejoicing as they involve either a ton of work
> and/or sacrificing portability.
>
> What am I supposed to do?

On Linux you have seccomp filters [1], and with kernel 3.5+ you can
optimize this a bit by filtering only the syscalls you are interested in.
Mike Frysinger discussed some options in a previous thread [2].

>
> I'm currently examining what it would take for option a), but I'm
> running into a steady stream of roadblocks. A major one I'm stuck at
> are the variadic functions (printf and friends). One way out seems to
> be to use GCC's __builtin_apply and calculate its size argument using
> a function that would have to be similar to glibc's
> parse_printf_format, but which would only return the number of bytes
> the arguments occupy on the stack (Would it be too much to ask to
> provide such a function as part of glibc?). But I don't know if
> register-involving calling conventions will harmonize with that
> approach. Will they? Also what makes me reluctant to explore this
> further is the fear that I will eventually have to implement not just
> wrappers, but full-on replacements. And I'd probably have to do the
> same for libstdc++, too.
Depending on what you intend to catch, you will indeed need *a lot* of
boilerplate for this approach. I haven't explored a way to interpose
variadic functions, but afaik you can't really do it in a *portable* way
(you will need to resort to either a compiler or an ABI extension).

>
> Thanks in advance for any help/clarification.
>
> P.S.: Solutions that involve installing a specially built version of
> glibc (e.g. with INLINE_SYSCALL undefined) are less than ideal because
> this project is not for personal, but public use, and having a
> custom-built libc as a dependency is thus effectively a showstopper.
> But maybe it would be possible to transplant a subset of routines from
> such a libc into my library. But how would I even do that? Close
> study of build logs tells me one of stdio-common/stamp.o and
> libc_pic.{a,os,os.clean} probably contains what I want, but I'm not
> sure what will break if I just copy that over.
>

This was suggested some time ago, and it is the idea behind libOS [3].
That thread has some discussion of the pros and cons of this approach.
You can also check how it was done there.

[1] https://sourceware.org/pipermail/libc-help/2021-August/006002.html
[2] https://sourceware.org/pipermail/libc-help/2021-August/006002.html
[3] https://sourceware.org/legacy-ml/libc-alpha/2019-09/msg00188.html
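
On the variadic question specifically: interposing an arbitrary variadic
function is a different matter, but the printf family has va_list
counterparts in ISO C (vprintf, vfprintf, vsnprintf, ...), so a wrapper can
forward its arguments without __builtin_apply. The following is only a
minimal sketch under stated assumptions, not a complete interposer: the
names traced_printf and trace_buf are hypothetical, the "log" is just a
fixed-size buffer, and the dlsym(RTLD_NEXT, ...) lookup that a real
LD_PRELOAD library would need is left out.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>

/* Hypothetical log sink: holds a copy of the most recent formatted
   output, truncated to the buffer size. A real tracer would write a
   time-stamped record to its log file descriptor instead. */
static char trace_buf[256];

/* Hypothetical wrapper name. In an LD_PRELOAD library this body would be
   exported as printf(), and the call to vprintf() below would go through
   a pointer obtained from the next object in the lookup order. */
int traced_printf(const char *fmt, ...)
{
    va_list ap, ap_log;
    int n;

    assert(fmt != NULL);

    va_start(ap, fmt);

    /* A va_list may only be consumed once, so format the log copy from
       a duplicate made with va_copy(). */
    va_copy(ap_log, ap);
    vsnprintf(trace_buf, sizeof trace_buf, fmt, ap_log);
    va_end(ap_log);

    /* Forward the original argument list to the va_list counterpart. */
    n = vprintf(fmt, ap);
    va_end(ap);

    return n;
}
```

A call such as traced_printf("%d-%s\n", 42, "ok") prints the text as usual
and leaves a copy in trace_buf. Note that even a correctly named wrapper
will miss some calls: GCC may compile a printf() with a constant format
into puts() or putchar(), and builds using _FORTIFY_SOURCE call
__printf_chk() instead, so those entry points would need wrappers too.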