From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from us-smtp-1.mimecast.com (us-smtp-delivery-1.mimecast.com [207.211.31.120]) by sourceware.org (Postfix) with ESMTP id 787223840C2F for ; Wed, 3 Jun 2020 22:06:44 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.3.2 sourceware.org 787223840C2F Received: from mail-qk1-f198.google.com (mail-qk1-f198.google.com [209.85.222.198]) (Using TLS) by relay.mimecast.com with ESMTP id us-mta-315-4MevjaQ_Pc6N1FJa_3B7Bw-1; Wed, 03 Jun 2020 18:06:40 -0400 X-MC-Unique: 4MevjaQ_Pc6N1FJa_3B7Bw-1 Received: by mail-qk1-f198.google.com with SMTP id h18so2854293qkj.13 for ; Wed, 03 Jun 2020 15:06:40 -0700 (PDT) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:subject:to:cc:references:from:organization :message-id:date:user-agent:mime-version:in-reply-to :content-language:content-transfer-encoding; bh=iM/66T74kzVhGnn6bJ73H0ZgzdJrsAZynW1198BKZuc=; b=Isan+IjxhSd9jBR4NymlmixapC8CKaD3uddmIpAGAhauzcmcsjPzQISLu+0LkWixO0 4EkAOG+SpRoJCIsENEFsRAlbc/7a7tRHwBJDgBkt5l5AOP1vc9e8XqZt5Faw1ZNOlWnR GeZEqXkWc63qfBKAWkfvTu7bJ3PR5RvQ+h7tXyB0QFdDNATSSyLrAd8kXGQskGaPPyCw g2BoGBGNzt7Qgoy0V9/oxK+T6QMyVCvKzJdrqrwGpyPjxEqRlvt7oee/dEnhPLKfRz+J ghpv7nlM8wbWXgL5nKgnvEOEfFwHa4W7P9/sDQSl8JW+tUT45sP4X3hNgeYsDwxpm7jS ZDaA== X-Gm-Message-State: AOAM530Og7TSEwv4sSFwsERIFwS1EIvfV1i+ii6tc4xAkWXCcsXyX7x2 c3DUqwAr9Sh/FLglFOq55DaD3VNZr9Vb1hdHd/QKLf1XhoCPwUvxCjs4tnFpbBUi/tQqaOcaQI0 9HfNNj5XSkAeaKasiVPHP X-Received: by 2002:a05:6214:405:: with SMTP id z5mr1947813qvx.112.1591221999969; Wed, 03 Jun 2020 15:06:39 -0700 (PDT) X-Google-Smtp-Source: ABdhPJxAuk6b8KFGMsj3hanPDrVc34pU2ihX2EYB0lIuqUcjiCx/RxIQRHVhRW0yHg0rLSgE11Q8Nw== X-Received: by 2002:a05:6214:405:: with SMTP id z5mr1947792qvx.112.1591221999587; Wed, 03 Jun 2020 15:06:39 -0700 (PDT) Received: from [192.168.1.4] (198-84-170-103.cpe.teksavvy.com. [198.84.170.103]) by smtp.gmail.com with ESMTPSA id y13sm2917295qto.23.2020.06.03.15.06.38 (version=TLS1_3 cipher=TLS_AES_128_GCM_SHA256 bits=128/128); Wed, 03 Jun 2020 15:06:38 -0700 (PDT) Subject: Re: [PATCH] elf: Fix dlclose of an empty namespace in auditing mode (bug 26076) To: Florian Weimer Cc: libc-alpha@sourceware.org References: <87zh9kuuw1.fsf@oldenburg2.str.redhat.com> <5d08b5e0-c96a-0c97-304f-8509f7b0f322@redhat.com> <87h7vruaoi.fsf@oldenburg2.str.redhat.com> From: Carlos O'Donell Organization: Red Hat Message-ID: Date: Wed, 3 Jun 2020 18:06:37 -0400 User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:68.0) Gecko/20100101 Thunderbird/68.7.0 MIME-Version: 1.0 In-Reply-To: <87h7vruaoi.fsf@oldenburg2.str.redhat.com> Content-Language: en-US X-Mimecast-Spam-Score: 0 X-Mimecast-Originator: redhat.com Content-Type: text/plain; charset=utf-8 Content-Transfer-Encoding: 7bit X-Spam-Status: No, score=-13.0 required=5.0 tests=BAYES_00, DKIMWL_WL_HIGH, DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, DKIM_VALID_EF, GIT_PATCH_0, RCVD_IN_DNSWL_NONE, RCVD_IN_MSPIKE_H3, RCVD_IN_MSPIKE_WL, SPF_HELO_NONE, SPF_PASS, TXREP autolearn=ham autolearn_force=no version=3.4.2 X-Spam-Checker-Version: SpamAssassin 3.4.2 (2018-09-13) on server2.sourceware.org X-BeenThere: libc-alpha@sourceware.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Libc-alpha mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 03 Jun 2020 22:06:46 -0000 On 6/3/20 5:00 PM, Florian Weimer wrote: > * Carlos O'Donell: > >> On 6/3/20 9:43 AM, Florian Weimer via Libc-alpha wrote: >>> ns->_ns_loaded is NULL if nothing has been loaded into the namespace. >>> >>> It seems difficult to hit this bug reliably, so this change does not >>> come with a test case. It was trigger by accident, due to TLS >>> exhaustion. >> >> I think this should fail catastrophically and quickly. > > Why should this be fatal to the process? Your backtrace does a good job explaining how you got to this point. I appreciate it, it is very illuminating. >>> elf/dl-close.c | 2 +- >>> 1 file changed, 1 insertion(+), 1 deletion(-) >>> >>> diff --git a/elf/dl-close.c b/elf/dl-close.c >>> index 73b2817bbf..896e59e42e 100644 >>> --- a/elf/dl-close.c >>> +++ b/elf/dl-close.c >>> @@ -782,7 +782,7 @@ _dl_close_worker (struct link_map *map, bool force) >>> { >>> struct link_map *head = ns->_ns_loaded; >>> /* Do not call the functions for any auditing object. */ >>> - if (head->l_auditing == 0) >>> + if (head != NULL && head->l_auditing == 0) >>> { >>> struct audit_ifaces *afct = GLRO(dl_audit); >>> for (unsigned int cnt = 0; cnt < GLRO(dl_naudit); ++cnt) >>> >> >> Use _dl_signal_error to indicate an internal error? > > This *is* on the cleanup path after an error already happened. This is > the backtrace I saw: I'm going to ignore the oddity of auditors auditing themselves, I guess it's logically consistent that the first loaded auditor won't see any of this, but that subsequently loaded auditors will. Once auditor loading fails in your case we begin unloading that object. What I don't understand though is why is GLRO(dl_naudit) non-zero? If this is the first loaded auditor we don't increment GLRO(dl_naudit) until after we return from load_audit_module, so do_audit is false, and we never get here If this is the second loaded auditor we do set do_audit to true, but this means we already dereferenced ns->_ns_loaded->l_auditing already once to compute do_audit. What has subsequently happened to ns->_ns_loaded? For it to be null means we removed *every* object from the namespace, but we just established we had at least one already loaded successfully? Why would we remove more than just the current object that failed to load? Why would the current audit modules failure result in ns->_ns_loaded to be null? I *do* see other code that checks for nulls, for example: elf/dl-object.c: 28 /* Add the new link_map NEW to the end of the namespace list. */ 29 void 30 _dl_add_to_namespace_list (struct link_map *new, Lmid_t nsid) 31 { 32 /* We modify the list of loaded objects. */ 33 __rtld_lock_lock_recursive (GL(dl_load_write_lock)); 34 35 if (GL(dl_ns)[nsid]._ns_loaded != NULL) 36 { 37 struct link_map *l = GL(dl_ns)[nsid]._ns_loaded; 38 while (l->l_next != NULL) 39 l = l->l_next; 40 new->l_prev = l; 41 /* new->l_next = NULL; Would be necessary but we use calloc. */ 42 l->l_next = new; 43 } 44 else 45 GL(dl_ns)[nsid]._ns_loaded = new; Here we check to see if _ns_loaded is non-null and so something different, but here we might expect that it's the first link map being added to this namespace. I'd like to hear what you think might be going wrong here. > (gdb) bt > #0 _dl_close_worker (map=, force=force@entry=true) > at dl-close.c:785 > #1 0x00007fbe468d3eb3 in _dl_open (file=0x7ffde8ed3f30 "\253\003\aE\276\177", > mode=mode@entry=-1946157055, > caller_dlopen=caller_dlopen@entry=0x7fbe468c2330 , nsid=9, > nsid@entry=-1, argc=1, argv=, env=0x7ffde8ed4550) > at dl-open.c:906 > #2 0x00007fbe468c120e in dlmopen_doit (a=a@entry=0x7ffde8ed41e0) at rtld.c:660 > #3 0x00007fbe468dbd5e in _dl_catch_exception ( > exception=exception@entry=0x7ffde8ed4120, > operate=operate@entry=0x7fbe468c11d0 , > args=args@entry=0x7ffde8ed41e0) at dl-error-skeleton.c:208 > #4 0x00007fbe468dbe03 in _dl_catch_error (objname=objname@entry=0x7ffde8ed41d0, > errstring=errstring@entry=0x7ffde8ed41d8, > mallocedp=mallocedp@entry=0x7ffde8ed41cf, > operate=operate@entry=0x7fbe468c11d0 , > args=args@entry=0x7ffde8ed41e0) at dl-error-skeleton.c:227 > #5 0x00007fbe468c48e4 in load_audit_module (last_audit=, > name=0x7ffde8ed42b8 "tst-auditmanymod9.so") at rtld.c:973 > #6 load_audit_modules (audit_list=0x7ffde8ed4220, main_map=) > at rtld.c:1114 > #7 dl_main (phdr=, phnum=, > user_entry=, auxv=) at rtld.c:1679 > #8 0x00007fbe468da84b in _dl_sysdep_start ( > start_argptr=start_argptr@entry=0x7ffde8ed4520, > dl_main=dl_main@entry=0x7fbe468c2330 ) at ../elf/dl-sysdep.c:252 > #9 0x00007fbe468c1e81 in _dl_start_final (arg=0x7ffde8ed4520) at rtld.c:489 > #10 _dl_start (arg=0x7ffde8ed4520) at rtld.c:582 > #11 0x00007fbe468c1098 in _start () > from /home/fweimer/src/gnu/glibc/build/elf/ld-linux-x86-64.so.2 > > (gdb) print exception > $1 = {objname = 0x7fbe450703ab "/home/fweimer/src/gnu/glibc/build/libc.so.6", > errstring = 0x7fbe45070380 "cannot allocate memory in static TLS block", > message_buffer = 0x0} > > We cannot report another error at this point. Agreed. Thanks. -- Cheers, Carlos.