To: GCC, GNU C Library, Binutils, gnu-gabi@sourceware.org
From: Florian Weimer
Subject: Invalid program counters and unwinding
Date: Mon, 01 Jan 2018 00:00:00 -0000

I'm looking at ways to speed up _Unwind_Find_FDE when libgcc is running
on top of glibc.

I have something (at the design level, with some of the code written)
which allows me to get a pointer to the PT_GNU_EH_FRAME segment in
memory in a lock-free fashion (so it would also be async-signal-safe).
This part also works when the program counter used in the search is
invalid and does not point into a loaded object, even in the case of
concurrent dlopen/dlclose.

However, it's still necessary to read the PT_GNU_EH_FRAME data itself.
If the argument to _Unwind_Find_FDE is not a valid program counter
found on the stack (that is, an address within a caller, where
unmapping the object with dlclose would be invalid), it could be a
random address in *another*, unrelated object, which then gets
dlclose'd (which is valid).

The current glibc-based implementation in libgcc calls dl_iterate_phdr,
which acquires a lock blocking dlclose for the entire duration of the
iteration. But I think this still doesn't support arbitrary, random PC
values, because in the worst case the PC value looks valid, we find
some unrelated FDE data with an associated personality routine, and end
up calling that, with disastrous consequences.

So it looks to me that the caller of _Unwind_Find_FDE needs to ensure
that the PC is a valid element of the call stack. Is this a correct
assumption?

I have some ideas how to make reading the PT_GNU_EH_FRAME data safe,
but the question is whether we actually need that.

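For illustration, here is a minimal sketch of the kind of
dl_iterate_phdr-based lookup described above. It is not the libgcc
code; struct eh_frame_query and lookup_eh_frame_hdr are made-up names,
and the sketch assumes glibc on a target that emits PT_GNU_EH_FRAME.
It only locates the segment for a PC and does not parse any FDE data.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE          /* needed for dl_iterate_phdr in <link.h> */
#endif
#include <assert.h>
#include <link.h>            /* dl_iterate_phdr, struct dl_phdr_info, ElfW */
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper type, not part of libgcc or glibc. */
struct eh_frame_query {
    uintptr_t pc;              /* in: address to look up */
    const void *eh_frame_hdr;  /* out: start of PT_GNU_EH_FRAME, or NULL */
    size_t eh_frame_hdr_size;  /* out: size of that segment in memory */
};

/* Called once per loaded object while glibc holds its loader lock. */
static int find_eh_frame_cb(struct dl_phdr_info *info, size_t size, void *data)
{
    struct eh_frame_query *q = data;
    const ElfW(Phdr) *eh = NULL;
    int pc_in_object = 0;
    (void)size;

    for (ElfW(Half) i = 0; i < info->dlpi_phnum; i++) {
        const ElfW(Phdr) *ph = &info->dlpi_phdr[i];
        if (ph->p_type == PT_LOAD) {
            uintptr_t start = (uintptr_t)info->dlpi_addr + ph->p_vaddr;
            if (q->pc >= start && q->pc - start < ph->p_memsz)
                pc_in_object = 1;
        } else if (ph->p_type == PT_GNU_EH_FRAME) {
            eh = ph;
        }
    }

    if (!pc_in_object || eh == NULL)
        return 0;              /* not this object: keep iterating */

    q->eh_frame_hdr = (const void *)((uintptr_t)info->dlpi_addr + eh->p_vaddr);
    q->eh_frame_hdr_size = eh->p_memsz;
    return 1;                  /* non-zero stops the iteration */
}

/* Returns 1 if a PT_GNU_EH_FRAME segment covering pc's object was found.
   Caveat: the returned pointer is only protected against dlclose while
   the callback runs. Reading it afterwards is safe only if the object
   cannot be unloaded, e.g. because pc really is a caller's address. */
int lookup_eh_frame_hdr(const void *pc, struct eh_frame_query *q)
{
    assert(q != NULL);
    q->pc = (uintptr_t)pc;
    q->eh_frame_hdr = NULL;
    q->eh_frame_hdr_size = 0;
    return dl_iterate_phdr(find_eh_frame_cb, q);
}
```

Note that a successful lookup says nothing about whether the PC is a
genuine return address: any address inside a PT_LOAD segment matches,
which is exactly the "PC value looks valid" case above.
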
Previous discussions:

  https://gcc.gnu.org/ml/gcc/2013-05/msg00253.html
  https://gcc.gnu.org/bugzilla/show_bug.cgi?id=71744
  https://sourceware.org/ml/libc-alpha/2016-07/msg00613.html
    (patch with a spread lock, still not async-signal-safe)

Thanks,
Florian