Date: Mon, 22 Nov 2021 11:46:30 -0600
Message-ID: <20211122114630.Horde.vByjLyp4gX9-IpeJ-NBFLp8@webmail.rice.edu>
From: jma14
To: Florian Weimer
Cc: John Mellor-Crummey, Xiaozhu Meng, "Mark W. Krentel", libc-alpha@sourceware.org
Subject: Re: Fwd: [PATCH v5 00/22] Some rtld-audit fixes
List-Id: Libc-alpha mailing list

On 11/19/21 13:18, Florian Weimer wrote:
> * Jonathon Anderson:
>>>> Right now, we only require the program headers which we can obtain
>>>> from getauxval(AT_PHDR), however this technique has questionable
>>>> portability and robustness (getauxval returns an unsigned long, not a
>>>> pointer).
>
>>> A glibc port to an architecture where a long value cannot hold all
>>> pointer values will have to provide an alternative interface similar
>>> to getauxval, but that returns pointer values.
>
>> I would go one step farther and say getauxval is already broken for any
>> 64-bit architecture, unsigned long is only required to support 32 bits
>> as per the C standards. One of my greater fears is that some exotic
>> compiler will cleverly allocate only 4 bytes of stack space for the
>> return value, and we wouldn't know except for a subtle bug (dependent
>> on optimization flags!) that crashes our entire tool with SEGVs in the
>> auditor (where GDB doesn't give properly unwound call stacks).
>
> If ported to such an architecture, glibc would need several changes to
> accommodate this. Newer architectures take this into account and do not
> do funny things.

Ah, right, I forgot the sizes of the standard types are part of the GNU
ABI. That resolves my concern then.

>> This is a much clearer description of the semantic, it would be very
>> helpful if the man pages used that sentence (or one like it) wherever
>> the l_addr value is exposed in the API (link_map->l_addr and
>> dl_phdr_info->dlpi_addr). It would also be very helpful if
>> Dl_info.dli_fbase was clearly documented as *not* l_addr but instead
>> byte 0/ELF header in the process image.
>
> I've made a note to update the manual pages.

Thanks!

>>>> dladdr gets its value from link_map->l_map_start instead of l_addr,
>>>> so the semantic we want is already present in a private field. It
>>>> seems to me these two fields could be swapped with little issue, if
>>>> altering the public semantic is not acceptable we could also be sated
>>>> if l_map_start was made public.
>>>
>>> Applications which know about the current semantics of l_addr will
>>> break, though. l_addr is also exposed to debuggers via the _r_debug
>>> interface. I really do not think we can make changes to l_addr. We
>>> have a similar issue around l_name being "" for the main program, and
>>> unfortunately I will have to argue quite strongly against changing
>>> that.
>
>> Is adding new public fields completely off the table?
>
> To struct link_map? We could probably pull it off, but it would be
> years until such a change will be in the hands of the users. There is
> an internal structure that overlaps with the public struct link_map, and
> some applications poke at the private bits at fixed offsets. We've
> started not to strip ld.so downstream, so that these applications can
> switch to DWARF data to avoid dependencies on fixed offsets, but that
> has been a very recent change.

I would consider poking at private fields a last resort; portability
across (versions of) GNU/Linux distributions is something we fight with
on occasion. It also doesn't help the l_name issue.

>> I am keeping my hopes low for changing the public semantics, but I hope
>> that adding new fields for the missing data at least could fall into
>> the "bug fixes" category.
>
> We can in theory add new flags to dladdr1 or new requests to dlinfo.
> These things do not impact ABI and are backportable.

These options would not help the issue. dlinfo (currently) crashes in
the auditor, and using it relies on the (undocumented) fact that dlopen
handles are in fact just struct link_map*. dladdr already provides the
corrected values, so adding a new option to dladdr1 isn't needed.

> Changing struct link_map is challenging, as I explained. Adding new
> functions definitely triggers deployment delays because those aren't
> considered backportable. Adding fields to existing structs is usually
> impossible because these structs may have been used to define
> application types, and new fields cause subsequent offsets to change.

Ah, right. Can the new fields be added in a future release? If so, then
with backported fixes to the dladdr crashes, dladdr(l_ld) can be used as
a portable workaround, provided it works as intended on all Glibc
targets (it works on x86-64/Linux but I'll need confirmation for
others).
There is a performance cost to dladdr, so I still think the new fields
are required; adding them also gives the workaround a clear lifetime.

If new fields are completely impossible (now and forever), then as an
LD_AUDIT user the only option (brought up so far) I feel comfortable
with is a change to the l_addr/l_name semantics. It's ABI-compatible, I
have a hard time believing there are applications that will regress
because of it, and it fixes an interface that was *never* quite right
(and differed from its documentation).

-Jonathon
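
For concreteness, here is a minimal sketch of the dladdr(l_ld) workaround
mentioned above, written for an ordinary dynamically linked glibc/Linux
process rather than an auditor. The helper names (object_image_base,
main_program_map, dump_image_bases) are made up for illustration. The
link_map is fetched with dlinfo(RTLD_DI_LINKMAP) only because that works
outside an auditor; inside one, la_objopen would hand over the link_map
directly. Whether dladdr resolves l_ld on every target is still the open
question, so failure is treated as a normal outcome.

```c
#define _GNU_SOURCE /* exposes dladdr(), dlinfo() and Dl_info in <dlfcn.h> */
#include <assert.h>
#include <dlfcn.h>
#include <inttypes.h>
#include <link.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical helper: address of byte 0 of the object's image (what
   dladdr reports as dli_fbase), or 0 if dladdr cannot resolve l_ld. */
static uintptr_t object_image_base(const struct link_map *map)
{
    Dl_info info;

    assert(map != NULL);
    if (map->l_ld == NULL || dladdr(map->l_ld, &info) == 0)
        return 0;
    return (uintptr_t)info.dli_fbase;
}

/* Hypothetical helper: link_map of the main program, obtained through
   the documented RTLD_DI_LINKMAP request. Returns NULL on failure. */
static struct link_map *main_program_map(void)
{
    struct link_map *map = NULL;
    void *handle = dlopen(NULL, RTLD_LAZY);

    if (handle == NULL)
        return NULL;
    if (dlinfo(handle, RTLD_DI_LINKMAP, &map) != 0)
        map = NULL;
    dlclose(handle); /* the main program itself is never unloaded */
    return map;
}

/* Print l_addr next to the dladdr-derived image base for every object
   in the list; returns how many objects dladdr could resolve. */
static int dump_image_bases(FILE *out)
{
    int resolved = 0;

    for (const struct link_map *m = main_program_map(); m != NULL;
         m = m->l_next) {
        uintptr_t base = object_image_base(m);
        const char *name = (m->l_name != NULL && m->l_name[0] != '\0')
                               ? m->l_name
                               : "(empty l_name)";

        fprintf(out, "%s: l_addr=0x%" PRIxPTR " image base=0x%" PRIxPTR "\n",
                name, (uintptr_t)m->l_addr, base);
        if (base != 0)
            resolved++;
    }
    return resolved;
}
```

Calling dump_image_bases(stdout) from main() prints one line per loaded
object. For a non-PIE executable l_addr is typically 0 while the image
base is not, which is the discrepancy discussed in this thread. Older
glibc releases may need -ldl at link time.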