From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.133.124]) by sourceware.org (Postfix) with ESMTPS id 369863858C39 for ; Thu, 13 Oct 2022 20:45:27 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.1 sourceware.org 369863858C39 Authentication-Results: sourceware.org; dmarc=pass (p=none dis=none) header.from=redhat.com Authentication-Results: sourceware.org; spf=pass smtp.mailfrom=redhat.com DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1665693926; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=i8DzcP3gt9uHKaVHJ6PEy3AIYc6E38SUzWHhGVc0lx4=; b=HloQPJHvOnm0tptwDigKarhX0w/zu6d7W9+9L98Jb3nEy1V1xwJ3gc98KDY5zv24BPSD4v rrJLHG3YrC/GV89nSq+3XSri4G+m0EQ2rtwJiNL7SG1GXK4QyW6HCYgEubvdM6rfEA2yA3 bxCEeg+NkyBi5c3rvJLIAFUWXunbYZY= Received: from mail-il1-f200.google.com (mail-il1-f200.google.com [209.85.166.200]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.3, cipher=TLS_AES_128_GCM_SHA256) id us-mta-394-kmMDbmgiOTa26Hykwy_xOA-1; Thu, 13 Oct 2022 16:45:25 -0400 X-MC-Unique: kmMDbmgiOTa26Hykwy_xOA-1 Received: by mail-il1-f200.google.com with SMTP id x6-20020a056e021bc600b002fc96f780e7so2448582ilv.10 for ; Thu, 13 Oct 2022 13:45:25 -0700 (PDT) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=content-transfer-encoding:in-reply-to:organization:from:references :to:content-language:subject:user-agent:mime-version:date:message-id :x-gm-message-state:from:to:cc:subject:date:message-id:reply-to; bh=i8DzcP3gt9uHKaVHJ6PEy3AIYc6E38SUzWHhGVc0lx4=; b=ZXc3h6KbvmBj5TNOF6IoEIS6DJnFnKdXw4ESyLbTyDoVFbyNH3HMTSvEuxs8lJ8b5D szbidi5SjXP9RH5z+ow7CR1IqKSSz8Gnesxw3PBvwSJvSEwgnYuhP6OYYl4uYw2tNs+i NCv1NbAr3vuSfuwkBEk16kEuIvAm9BhqoB0/3QCHf1E7cxmDQWtmFM7jgPW8SnqbjjZ+ WqsSGp33qQ5KSK4NsVOYicoVqQ/03SOVirIuT68YQFbgTNaFb8wrcW/clp/YyqXYK2Pr cdZjcXqFdHkM4U/Mf3O3kcPs87OHbnJTnQMlGkOdQj/ek/9Ycx2FzCMib8u53+ZQIck0 KElw== X-Gm-Message-State: ACrzQf2bHE8S7AyxtqoFm2A4VTQH3NXa7cZSEfoUoxRgecQLCqnXAzQN nK/rVydYBeo0V3TB/YsiekeUkwV3gu74XAnAYwKf+8ow4sF3cZZWiBVjWKG35niT88B9pFgnCvv AFh90r+Y5QF0vhAccQJDa X-Received: by 2002:a92:502:0:b0:2fc:5e54:b4a6 with SMTP id q2-20020a920502000000b002fc5e54b4a6mr889436ile.41.1665693925003; Thu, 13 Oct 2022 13:45:25 -0700 (PDT) X-Google-Smtp-Source: AMsMyM50Q2KHoIv3jX/jbsIfbPK9vSbql87xK0Tn/4Ir4ZsDlLQ/a90A2QlyH9oWuTODD35uNfnh0Q== X-Received: by 2002:a92:502:0:b0:2fc:5e54:b4a6 with SMTP id q2-20020a920502000000b002fc5e54b4a6mr889425ile.41.1665693924707; Thu, 13 Oct 2022 13:45:24 -0700 (PDT) Received: from [192.168.0.241] (192-0-145-146.cpe.teksavvy.com. [192.0.145.146]) by smtp.gmail.com with ESMTPSA id e6-20020a056602044600b006bc0f8e18d4sm253957iov.34.2022.10.13.13.45.22 (version=TLS1_3 cipher=TLS_AES_128_GCM_SHA256 bits=128/128); Thu, 13 Oct 2022 13:45:23 -0700 (PDT) Message-ID: Date: Thu, 13 Oct 2022 16:45:21 -0400 MIME-Version: 1.0 User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:91.0) Gecko/20100101 Thunderbird/91.11.0 Subject: Re: [PATCH] elf: Do not completely clear reused namespace in dlmopen (bug 29600) To: Florian Weimer , libc-alpha@sourceware.org, Adhemerval Zanella References: <87pmfn1ftc.fsf@oldenburg.str.redhat.com> From: Carlos O'Donell Organization: Red Hat In-Reply-To: <87pmfn1ftc.fsf@oldenburg.str.redhat.com> X-Mimecast-Spam-Score: 0 X-Mimecast-Originator: redhat.com Content-Language: en-US Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 7bit X-Spam-Status: No, score=-13.3 required=5.0 tests=BAYES_00,DKIMWL_WL_HIGH,DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF,GIT_PATCH_0,KAM_SHORT,NICE_REPLY_A,RCVD_IN_DNSWL_NONE,SPF_HELO_NONE,SPF_NONE,TXREP autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org List-Id: On 9/22/22 13:26, Florian Weimer via Libc-alpha wrote: > The data in the _ns_debug member must be preserved, otherwise > _dl_debug_initialize enters an infinite loop. To be conservative, > only clear the libc_map member for now, to fix bug 29528. > > Fixes commit d0e357ff45a75553dee3b17ed7d303bfa544f6fe > ("elf: Call __libc_early_init for reused namespaces (bug 29528)"), > by reverting most of it. > > Tested on i686-linux-gnu and x86_64-linux-gnu. Adhemerval had the review status on this patch in patchwork, but I'm CC'ing him here to notify him that I've done a review. I had to do more than average due-diligence here since I was concerned about the namespace reuse and what might need to be initialized. LGTM. Reviewed-by: Carlos O'Donell Tested-by: Carlos O'Donell > --- > elf/dl-open.c | 14 ++++++-------- > elf/tst-dlmopen-twice.c | 28 ++++++++++++++++++++++++---- > 2 files changed, 30 insertions(+), 12 deletions(-) > > diff --git a/elf/dl-open.c b/elf/dl-open.c > index 46e8066fd8..e7db5e9642 100644 > --- a/elf/dl-open.c > +++ b/elf/dl-open.c > @@ -844,15 +844,13 @@ _dl_open (const char *file, int mode, const void *caller_dlopen, Lmid_t nsid, > _dl_signal_error (EINVAL, file, NULL, N_("\ > no more namespaces available for dlmopen()")); > } > + else if (nsid == GL(dl_nns)) > + { > + __rtld_lock_initialize (GL(dl_ns)[nsid]._ns_unique_sym_table.lock); > + ++GL(dl_nns); > + } OK. We are in the LM_ID_NEWLM case. A new namespace is being created. OK. We reinitialize _ns_unique_sym_table.lock (one of two things) > > - if (nsid == GL(dl_nns)) > - ++GL(dl_nns); > - > - /* Initialize the new namespace. Most members are > - zero-initialized, only the lock needs special treatment. */ > - memset (&GL(dl_ns)[nsid], 0, sizeof (GL(dl_ns)[nsid])); OK. We don't zero the namespace structure. We want our own r_brk to remain non-null. > - __rtld_lock_initialize (GL(dl_ns)[nsid]._ns_unique_sym_table.lock); > - > + GL(dl_ns)[nsid].libc_map = NULL; OK. Reset libc_map to NULL in order that dl_open_worker_begin can re-inint libc if needed (two of two things). > _dl_debug_update (nsid)->r_state = RT_CONSISTENT; > } > /* Never allow loading a DSO in a namespace which is empty. Such I wanted to see the infinite loop for myself to be able to review this patch, so I backed out the change and debugged tst-dlmopen-twice. I confirmed that the loop is in the debug initialization of _r_debug_extended: #0 _dl_debug_initialize (ldbase=ldbase@entry=0, ns=ns@entry=2) at dl-debug.c:77 #1 0x00007ffff7fd66c5 in dl_open_worker_begin (a=a@entry=0x7fffffffd1c0) at dl-open.c:530 #2 0x00007ffff7f26909 in __GI__dl_catch_exception (exception=, operate=, args=) at /mnt/ssd/carlos/src/glibc-review/elf/dl-error-skeleton.c:208 #3 0x00007ffff7fd5ea6 in dl_open_worker (a=a@entry=0x7fffffffd1c0) at dl-open.c:782 #4 0x00007ffff7f26909 in __GI__dl_catch_exception (exception=, operate=, args=) at /mnt/ssd/carlos/src/glibc-review/elf/dl-error-skeleton.c:208 #5 0x00007ffff7fd6288 in _dl_open (file=, mode=, caller_dlopen=0x7ffff7fc05b9, nsid=2, argc=2, argv=0x7fffffffd698, env=0x7fffffffd6b0) at dl-open.c:886 #6 0x00007ffff7e62b80 in dlmopen_doit (a=a@entry=0x7fffffffd430) at dlmopen.c:61 #7 0x00007ffff7f26909 in __GI__dl_catch_exception (exception=exception@entry=0x7fffffffd390, operate=, args=) at /mnt/ssd/carlos/src/glibc-review/elf/dl-error-skeleton.c:208 #8 0x00007ffff7f269af in __GI__dl_catch_error (objname=0x7fffffffd3f0, errstring=0x7fffffffd3f8, mallocedp=0x7fffffffd3ef, operate=, args=) at /mnt/ssd/carlos/src/glibc-review/elf/dl-error-skeleton.c:227 #9 0x00007ffff7e62826 in _dlerror_run (operate=operate@entry=0x7ffff7e62b20 , args=args@entry=0x7fffffffd430) at dlerror.c:138 #10 0x00007ffff7e62c76 in dlmopen_implementation (dl_caller=, mode=, file=, nsid=) at dlmopen.c:76 #11 ___dlmopen (nsid=, file=, mode=) at dlmopen.c:86 66 else if (DL_NNS > 1) 67 { 68 r = &GL(dl_ns)[ns]._ns_debug; 69 if (r->base.r_brk == 0) 70 { 71 /* Add the new namespace to the linked list. After a namespace 72 is initialized, r_brk becomes non-zero. A namespace becomes 73 empty (r_map == NULL) when it is unused. But it is never 74 removed from the linked list. */ 75 struct r_debug_extended *p; 76 for (pp = &_r_debug_extended.r_next; 77 (p = *pp) != NULL; 78 pp = &p->r_next) Here we loop forever because of the self-reference. (gdb) 77 (p = *pp) != NULL; (gdb) 78 pp = &p->r_next) (gdb) 77 (p = *pp) != NULL; (gdb) 78 pp = &p->r_next) (gdb) 77 (p = *pp) != NULL; (gdb) 78 pp = &p->r_next) .... 79 ; 80 81 r->base.r_version = 2; 82 } 83 } The self-reference is added because a namespace is only added once to the _r_debug_extended structure, and we need to keep that invariant. Since the namespace was added once already, when we go to add it again we create an loop with r_next. AFAICT it is the clearing of r_brk which breaks the expectation, causes the namespace to be added again, and creates the loop. Confirmed that with the patch we correctly initialize _r_debug_extended and correctly track r_next and add namespaces as they first appear, never removing them. > diff --git a/elf/tst-dlmopen-twice.c b/elf/tst-dlmopen-twice.c > index 449f3c8fa9..70c71fe19c 100644 > --- a/elf/tst-dlmopen-twice.c > +++ b/elf/tst-dlmopen-twice.c > @@ -16,18 +16,38 @@ OK. Test looks good, adds recursion to the test to create a dlmopen/dlmclose/dmopen > License along with the GNU C Library; if not, see > . */ > > -#include > +#include > #include > +#include > > -static int > -do_test (void) > +/* Run the test multiple times, to check finding a new namespace while > + another namespace is already in use. This used to trigger bug 29600. */ > +static void > +recurse (int depth) > { > - void *handle = xdlmopen (LM_ID_NEWLM, "tst-dlmopen-twice-mod1.so", RTLD_NOW); > + if (depth == 0) > + return; > + > + printf ("info: running at depth %d\n", depth); > + void *handle = xdlmopen (LM_ID_NEWLM, "tst-dlmopen-twice-mod1.so", > + RTLD_NOW); > xdlclose (handle); > handle = xdlmopen (LM_ID_NEWLM, "tst-dlmopen-twice-mod2.so", RTLD_NOW); > int (*run_check) (void) = xdlsym (handle, "run_check"); > TEST_COMPARE (run_check (), 0); > + recurse (depth - 1); > xdlclose (handle); > +} > + > +static int > +do_test (void) > +{ > + /* First run the test without nesting. */ > + recurse (1); > + > + /* Then with nesting. The constant needs to be less than the > + internal DL_NNS namespace constant. */ > + recurse (10); > return 0; > } > > > base-commit: 340097d0b50eff9d3058e06c6989ae398c653d4a > -- Cheers, Carlos.