Message-ID: <630fa17528c6050d60f524aa88ad5a057cae1603.camel@rice.edu>
Subject: Re: [PATCH 12/13] dlfcn,elf: implement dlmem() [BZ #11767]
From: Jonathon Anderson
To: stsp, Carlos O'Donell, libc-alpha@sourceware.org
Date: Wed, 29 Mar 2023 09:10:49 -0500
In-Reply-To: <3541bbd7-8a68-2064-bb63-2a921cfe3bb1@yandex.ru>
References: <20230318165110.3672749-1-stsp2@yandex.ru> <20230318165110.3672749-13-stsp2@yandex.ru> <3541bbd7-8a68-2064-bb63-2a921cfe3bb1@yandex.ru>

On Wed, 2023-03-29 at 18:51 +0500, stsp via Libc-alpha wrote:
>
> 29.03.2023 18:45, Carlos O'Donell wrote:
> > On 3/18/23 12:51, Stas Sergeev via Libc-alpha wrote:
> > > This patch adds the following function:
> > > void *dlmem(const unsigned char *buffer, size_t size, int flags,
> > >             struct dlmem_args *dlm_args);
> >
> > I am raising a sustained objection to including dlmem() in glibc.
> >
> > I appreciate your effort in working on this serious, and I think *many* of
> > the core changes you propose are good cleanups.
> >
> > In my experience it is the wrong level of abstraction.
> >
> > To implement fdlopen on top of dlmem requires PT_LOAD processing and that
> > will duplicate into userspace a significant part of the complexity of ELF
> > loading and segment handling. The test case you include below is incomplete
> > when it comes to implementing a functional fdlopen() across the supported
> > architectures and toolchain variants.
>
>
> Carlos, please, be reasonable, I don't understand
> what do you want from me. :)
> Why my test-cases are incomplete?
> Why some "complete" test-case needs an elf
> parsing? Will this ever be demonstrated with
> some example or anything?
> I am really a bit tired of that permanently
> recurring argument about some elf parsing.
> I don't even know what are you talking about.
>
> Where, just where have you seen it in my
> patch? Or if its not there, why do you mention
> it?

Stas,

Please do some research into the ELF file format. Neither your fdlopen implementation in the test cases nor your dlopen_with_offset implementation in the email chain implements it correctly.

AFAICT, the first glaring issue with both of your implementations is that you have neglected the case where p_offset != p_vaddr, i.e. a segment is mmapped to a different location than its layout in the file. There are a LOT of binaries out in the wild where this is the case. Here's a quick one-liner to help you find some on your own box; I have 11712 such binaries on my Debian system:

    find /usr/lib -type f -exec grep -aq '^.ELF' {} \; -print 2>/dev/null | while read -r bin; do if readelf -l "$bin" | grep LOAD | grep -vE 'LOAD[[:space:]]+(0x[0-9a-f]+) (\1)'; then echo "$bin"; fi; done

The second glaring issue (from my perspective) is that you are mmapping the entire file, instead of just the executable code.
I have personally compiled binaries where the DWARF debugging information was far larger than the code; one extreme case was a ~7.7GB binary of which merely ~130MB was .text. It is critical for performance that only that ~130MB is loaded from disk in the nominal case.

There are likely more glaring issues; these are just the two that first came to mind.

Needless to say, both of these issues require the user of dlmem() to parse the program headers of the binary and mmap() the binary according to the ELF standard. This is a complex and delicate process, and is already implemented in full in Glibc. The fact that users have to duplicate this implementation makes dlmem() a terrible API abstraction and, as others have asserted, unsuitable for inclusion in Glibc.

-Jonathon
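
The p_offset != p_vaddr check described above can also be sketched without depending on readelf's output format. What follows is only an illustration under stated assumptions: the helper names load_segments and mismatched_segments are made up for this sketch, the header offsets are the standard ELF32/ELF64 layouts, and the PN_XNUM case (more than 0xffff program headers) is deliberately ignored.

```python
import struct

PT_LOAD = 1  # program header type for loadable segments


def load_segments(image):
    """Return (p_offset, p_vaddr, p_filesz, p_memsz) for every PT_LOAD entry.

    `image` is any bytes-like object holding the ELF header and program
    header table. Hypothetical helper, not part of any library.
    """
    if bytes(image[:4]) != b"\x7fELF":
        raise ValueError("not an ELF image")
    ei_class = image[4]                      # EI_CLASS: 1 = 32-bit, 2 = 64-bit
    if ei_class not in (1, 2):
        raise ValueError("unknown ELF class")
    is_64 = ei_class == 2
    endian = "<" if image[5] == 1 else ">"   # EI_DATA: 1 = little-endian

    # Locate the program header table from the ELF header.
    if is_64:
        (e_phoff,) = struct.unpack_from(endian + "Q", image, 0x20)
        e_phentsize, e_phnum = struct.unpack_from(endian + "HH", image, 0x36)
    else:
        (e_phoff,) = struct.unpack_from(endian + "I", image, 0x1C)
        e_phentsize, e_phnum = struct.unpack_from(endian + "HH", image, 0x2A)

    segments = []
    for index in range(e_phnum):
        base = e_phoff + index * e_phentsize
        if is_64:
            # Elf64_Phdr: type, flags, offset, vaddr, paddr, filesz, memsz, align
            (p_type, _flags, p_offset, p_vaddr, _paddr,
             p_filesz, p_memsz, _align) = struct.unpack_from(
                endian + "IIQQQQQQ", image, base)
        else:
            # Elf32_Phdr: type, offset, vaddr, paddr, filesz, memsz, flags, align
            (p_type, p_offset, p_vaddr, _paddr,
             p_filesz, p_memsz, _flags, _align) = struct.unpack_from(
                endian + "IIIIIIII", image, base)
        if p_type == PT_LOAD:
            segments.append((p_offset, p_vaddr, p_filesz, p_memsz))
    return segments


def mismatched_segments(image):
    """Return the PT_LOAD segments whose file offset differs from their vaddr."""
    return [seg for seg in load_segments(image) if seg[0] != seg[1]]
```

Summing the p_filesz values returned by load_segments and comparing the total against the file size gives a rough idea of how much of a binary is actually meant to be loaded. Since the functions only index and unpack from the buffer, passing a read-only mmap.mmap object instead of the full file contents should also work and avoids reading a multi-gigabyte file just to inspect its headers.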