From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.133.124]) by sourceware.org (Postfix) with ESMTPS id 1DE403858D1E for ; Mon, 20 Feb 2023 15:50:49 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.2 sourceware.org 1DE403858D1E Authentication-Results: sourceware.org; dmarc=pass (p=none dis=none) header.from=redhat.com Authentication-Results: sourceware.org; spf=pass smtp.mailfrom=redhat.com DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1676908248; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=iuJ2PqSZe9PuOvVcORvGgZi6FjUNors2m2qf2EwmymY=; b=KzhuMHqpdYBprgxD9D+RpKLIeRscT8AtdbxxAXl0jnf4lCIFGb86RvF1VTMOB7QvRw2qmI Y+PQi6eaJNpbpZNCeWWIO2AZl2yAwBWTon6Pe3d79FvSz+aLch7oMlCX3X0Cs1ySJEaS8v BgCL8fG6xTNctN7mI3ebtgPaNJppskk= Received: from mail-io1-f71.google.com (mail-io1-f71.google.com [209.85.166.71]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.3, cipher=TLS_AES_128_GCM_SHA256) id us-mta-126-1OWkpDIVOPeEmOw3CnKgyw-1; Mon, 20 Feb 2023 10:50:47 -0500 X-MC-Unique: 1OWkpDIVOPeEmOw3CnKgyw-1 Received: by mail-io1-f71.google.com with SMTP id s1-20020a6bd301000000b0073e7646594aso1188335iob.8 for ; Mon, 20 Feb 2023 07:50:47 -0800 (PST) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=content-transfer-encoding:in-reply-to:organization:from:references :to:content-language:subject:user-agent:mime-version:date:message-id :x-gm-message-state:from:to:cc:subject:date:message-id:reply-to; bh=iuJ2PqSZe9PuOvVcORvGgZi6FjUNors2m2qf2EwmymY=; b=2PhQEdBTjdVsV6F/00B/bZmrw71eoH+9oA1FrVe1pbLQO8i0zv3GePE/8y+LDXpQqk TK7+57v5TW7u22l1PtVhogodbBXcy1m4etLsQHPys1jPW9NdQqzs3/+N1GHk6fxiuBQK cR517y5M3JPL+5KHmC4WzhYw5kj9d5fGLveLpJyY1PZvHKgCQz6+pi1VV+XHJGiWmcAV kCcA5gNavf49Wg6VAMS9GnNIKliJp19AZ1+SvwAQQI6+/okQMK/Z3F10Qikz65cRM2fq jXhQldTKkyQJhD5AaQ1OhsptX5YHMP4zVq45ctCm1XS45WK+i7EGTWp9VswjoyEjKZwn 1gzw== X-Gm-Message-State: AO0yUKVpuCO1GFrYxMO4meJkYh/n7Zx4st0bZ5v/ECPganURjQ7+C65+ n4VRdg/T/vIWKKIfIPuBh9GcJ4bddWoEmdaxEmMRpBb6JQf1vxDHBQK6mz7oZAIzBrRUZBGdJrg Hb43yPeCsNq7w1gefOFag X-Received: by 2002:a5e:a80b:0:b0:71e:15f:2911 with SMTP id c11-20020a5ea80b000000b0071e015f2911mr235092ioa.1.1676908246803; Mon, 20 Feb 2023 07:50:46 -0800 (PST) X-Google-Smtp-Source: AK7set/be2ztpoycTYjDcpiBPKGP8xmmi6b66TJSYXRUy3QCnXpnuu5Q4bPEmM8CRy+XyrOj5uDZvw== X-Received: by 2002:a5e:a80b:0:b0:71e:15f:2911 with SMTP id c11-20020a5ea80b000000b0071e015f2911mr235079ioa.1.1676908246470; Mon, 20 Feb 2023 07:50:46 -0800 (PST) Received: from [192.168.0.241] ([198.48.244.52]) by smtp.gmail.com with ESMTPSA id u3-20020a02c043000000b00363362cd476sm307254jam.101.2023.02.20.07.50.45 (version=TLS1_3 cipher=TLS_AES_128_GCM_SHA256 bits=128/128); Mon, 20 Feb 2023 07:50:45 -0800 (PST) Message-ID: Date: Mon, 20 Feb 2023 10:50:44 -0500 MIME-Version: 1.0 User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:102.0) Gecko/20100101 Thunderbird/102.6.0 Subject: Re: [PATCH 2/3] dlfcn,elf: implement dlmem() and audit [BZ #11767] To: Stas Sergeev , libc-alpha@sourceware.org, Paul Pluzhnikov References: <20230215165541.1107137-1-stsp2@yandex.ru> <20230215165541.1107137-3-stsp2@yandex.ru> From: Carlos O'Donell Organization: Red Hat In-Reply-To: <20230215165541.1107137-3-stsp2@yandex.ru> X-Mimecast-Spam-Score: 0 X-Mimecast-Originator: redhat.com Content-Language: en-US Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 7bit X-Spam-Status: No, score=-6.9 required=5.0 tests=BAYES_00,DKIMWL_WL_HIGH,DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF,NICE_REPLY_A,RCVD_IN_DNSWL_NONE,RCVD_IN_MSPIKE_H2,SPF_HELO_NONE,SPF_NONE,TXREP autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org List-Id: On 2/15/23 11:55, Stas Sergeev via Libc-alpha wrote: > This patch adds the following function: > void *dlmem(const unsigned char *buffer, size_t size, int flags); Stas, Thank you very much for this contribution. This patch series was raised during the weekly patch review and I wanted to take the time review before your implementation moves too far forward. My understanding of your use case is based on the discussion you had in bug 30007, and I also reviewed bug 11767 for background. I also reviewed the discussions on libc-alpha from Joseph Myers, Florian Weimer, and Andreas Schwab. Florian has open questions about gdb and audit requirements which I didn't see addressed in this series. tl;dr My position is that the semantics of dlmem() is at a level of abstraction that impacts the present security, audit, and developer requirements, and that something like BSDs fdlopen() or Google's dlopen_with_offset() would serve as a better solution overall. Details ======= There are several paths forward for the uses cases discussed, but two of them come to mind and have seen some discussion on this list: (a) fdlopen() as in BSD. * All normal operations that dlopen would do are done, but starting from an fd. * Mirrors other new APIs e.g. signalfd, pidfd etc. Maybe we call dlopenfd() on Linux. (b) dlmem() as in your patch. The semantic issues I see with dlmem() are as follows: * No guarantee that the memory that is loaded meets the preconditions for the dynamic loader and application uses. - We could argue that it is the application that needs to meet those requirements or the system is in an undefined state, however the better an API is the more it ensures by design that we cannot be in such states. * The API pushes the abstraction of a "file descriptor" completely out of the design and in doing so introduces a level of change that requires a new LD_AUDIT interface, and likely other tooling changes. If instead the semantics are raised up a level to fdlopen() then we have: * Application developer can still do everything they wanted, but they need one additional syscall, memfd_create() to turn the memory into a file descriptor and that has the added benefit of being a kernel-side auditable event which is already used in JIT/FFI e.g. libffi. - Kernel support appeared in Linux 3.17 and glibc 2.27 (2018). - Kernel lockdown of memfd_create() is a known issue and discussed by JITs with kernel developers e.g. OpenJDK uses memfd_create(), libffi uses memfd_create(). * Additionally users with the use case that they have an embedded DSO on disk now need to do less work in their application because they can use fdlopen() with an fd already open and at a suitable offset. Note that the fd passed is never closed. * No new LD_AUDIT interfaces reuqired so auditors and developer tooling does not need to be updated. - Though now it's possible to skip la_objsearch for the opening of the object, but that was always a possible scenario. As of today I Google has dlopen_with_offset(), almost fdlopen(), in google/grte/v5-2.27/master, but it has not been contributed for general inclusion. This to me seems to indicate that industry best practice today continues to be around the use of a file descriptor. My suggestion therefore is to attempt to refactor what you have around an API that is like fdlopen(), and see what the implementation and performance looks like in glibc. Thoughts? > It is the same as dlopen() but allows to dynamic-link solibs from > the memory buffer, rather than from a file as dlopen() does. > > "buffer" arg is the pointer to the solib image in memory. > "size" is the solib image size. Must be smaller-or-equal to the > actual buffer size. > "flags" is the same flags argument used in dlopen(). > > The idea behind the implementation is very simple: where the > dlopen() would mmap() the file, dlmem() does anonymous > mmap()+memcpy(). > Unfortunately the glibc code was too bound to the file reads, > so the patch looks bigger than it should. Some refactorings > were needed to avoid big copy/pasts and code duplications. > In particular, _dl_map_object_from_fd() was split and now calls > 2 functions that are also called from __dl_map_object_from_mem(). > The same treatment was applied to open_verify() - the part that > can be shared, was split into do_open_verify(). > > This patch adds a test-case named tst-dlmem. > > This patch also adds the _dl_audit_premap_dlmem LD_AUDIT > extension. It passes map length to la_premap_dlmem() and gets > back an fd. If returned fd==-1 then the dlmem() space is > anonymously mapped, which is a default behavior w/o audit. > But it is possible to create a shm area of the needed size > with shm_open()/ftruncate() and return that fd. Then you > will be able to access the solib image via that fd, so that > you can mmap() it to another process or another address of > the same process. > > The test-case for that functionality is called tst-audit-dlmem. > > The test-suite was run on x86_64/64 and showed no regressions. > > Signed-off-by: Stas Sergeev -- Cheers, Carlos.