From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mail-oi1-x231.google.com (mail-oi1-x231.google.com [IPv6:2607:f8b0:4864:20::231]) by sourceware.org (Postfix) with ESMTPS id BA5E93982419 for ; Fri, 13 May 2022 19:56:36 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.1 sourceware.org BA5E93982419 Received: by mail-oi1-x231.google.com with SMTP id w130so11455536oig.0 for ; Fri, 13 May 2022 12:56:36 -0700 (PDT) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=x-gm-message-state:message-id:date:mime-version:user-agent:subject :content-language:to:references:from:in-reply-to :content-transfer-encoding; bh=h0wNd547jK5/+lyPtDZRtcKpw88D6qouEH4sJxBFLdY=; b=ruHQ0I/iEXTc+o19zZrJJ1M+/ePoadgKBkfa15s+/ORARMumj9r5a58WmAg+9needB TLwS8nfV+zqwvlvygMwa9jWqn5/Uk5vIeExsoLuXj5wZf1jziHiMY7pUDYOnRDX2wL5G Yy0pXCEESVYrZ3GhlotX5e54eJtt+nHbDsIaxcV9nXOEL9egc6M+OoNwk63buob0b74X +DkwS0GlsbpHMq4+friFXEI9/efppt1wDBT4uYARvy/fXcZEawL41Mt8hgcepaRtRWcw gL4RWYqQz9xbfTlLeFNB/t/h9UzmmfA2Diks0hWKJ0cPcBXHjiJGbDfdgXc4IK5nx+JS AI4A== X-Gm-Message-State: AOAM532mwpgZZydQAqkdhQCUmQ0X+pwjZIKclib+/1yswVw8cvfG35lp k25SeD3QKh1E0ttIxTPkYvFyojTE7dYr9Q== X-Google-Smtp-Source: ABdhPJwxKvMC1iXnsMX9nxEGjLyll9Wyd0TvLDWCQv6bDUIU1jlX2pp9E08pwc8ROMDWIEdSFmtLVg== X-Received: by 2002:a05:6808:219c:b0:326:4456:d0be with SMTP id be28-20020a056808219c00b003264456d0bemr3116509oib.79.1652471795783; Fri, 13 May 2022 12:56:35 -0700 (PDT) Received: from ?IPV6:2804:431:c7ca:5fbd:b7af:daa5:1bf0:c543? ([2804:431:c7ca:5fbd:b7af:daa5:1bf0:c543]) by smtp.gmail.com with ESMTPSA id 66-20020a9d0f48000000b006060322125bsm1388077ott.43.2022.05.13.12.56.34 (version=TLS1_3 cipher=TLS_AES_128_GCM_SHA256 bits=128/128); Fri, 13 May 2022 12:56:35 -0700 (PDT) Message-ID: Date: Fri, 13 May 2022 16:56:32 -0300 MIME-Version: 1.0 User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:91.0) Gecko/20100101 Thunderbird/91.9.0 Subject: Re: [PATCH v6 1/4] rtld: Use generic argv adjustment in ld.so [BZ #23293] Content-Language: en-US To: Szabolcs Nagy , libc-alpha@sourceware.org References: From: Adhemerval Zanella In-Reply-To: Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 7bit X-Spam-Status: No, score=-14.0 required=5.0 tests=BAYES_00, DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, DKIM_VALID_EF, GIT_PATCH_0, NICE_REPLY_A, RCVD_IN_DNSWL_NONE, SPF_HELO_NONE, SPF_PASS, TXREP, T_SCC_BODY_TEXT_LINE autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org X-BeenThere: libc-alpha@sourceware.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Libc-alpha mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 13 May 2022 19:56:39 -0000 On 05/05/2022 10:58, Szabolcs Nagy via Libc-alpha wrote: > When an executable is invoked as > > ./ld.so [ld.so-args] ./exe [exe-args] > > then the argv is adujusted in ld.so before calling the entry point of > the executable so ld.so args are not visible to it. On most targets > this requires moving argv, env and auxv on the stack to ensure correct > stack alignment at the entry point. This had several issues: > > - The code for this adjustment on the stack is written in asm as part > of the target specific ld.so _start code which is hard to maintain. > > - The adjustment is done after _dl_start returns, where it's too late > to update GLRO(dl_auxv), as it is already readonly, so it points to > memory that was clobbered by the adjustment. This is bug 23293. > > - _environ is also wrong in ld.so after the adjustment, but it is > likely not used after _dl_start returns so this is not user visible. > > - _dl_argv was updated, but for this it was moved out of relro, which > changes security properties across targets unnecessarily. > > This patch introduces a generic _dl_start_args_adjust function that > handles the argument adjustments after ld.so processed its own args > and before relro protection is applied. > > The same algorithm is used on all targets, _dl_skip_args is now 0, so > existing target specific adjustment code is no longer used. The bug > affects aarch64, alpha, arc, arm, csky, ia64, nios2, s390-32 and sparc, > other targets don't need the change in principle, only for consistency. > > The GNU Hurd start code relied on _dl_skip_args after dl_main returned, > now it checks directly if args were adjusted and fixes the Hurd startup > data accordingly. > > Follow up patches can remove _dl_skip_args and DL_ARGV_NOT_RELRO. > > Tested on aarch64-linux-gnu and cross tested on i686-gnu. > --- > v6: > - don't pass start_argptr to _dl_main, just use _dl_argv-1. > - add comment for _dl_start_args_adjust. > - add assert checks to _dl_start_args_adjust and simplify it a bit. > v5: > - Hurd specific changes. > v4: > - New code is unconditionally used on all targets. > - Hide auxv adjustments behind HAVE_AUX_VECTOR. > - DL_NEED_START_ARGS_ADJUST macro is removed. > - _dl_skip_args is no longer unused. > - start_argptr is passed down to dl_main instead of using a global. > - moved aarch64 DL_ARGV_NOT_RELRO removal to separate patch. > v2: > - use p != NULL, and a_type != AT_NULL > - remove the confusing paragraph from the commit message. Looks ok, just a minor suggestion below. Reviewed-by: Adhemerval Zanella > --- > elf/rtld.c | 73 ++++++++++++++++++++++++++++------- > sysdeps/mach/hurd/dl-sysdep.c | 30 +++++++------- > 2 files changed, 73 insertions(+), 30 deletions(-) > > diff --git a/elf/rtld.c b/elf/rtld.c > index 3b2e05bf4c..b5070d453f 100644 > --- a/elf/rtld.c > +++ b/elf/rtld.c > @@ -1306,6 +1306,62 @@ rtld_setup_main_map (struct link_map *main_map) > return has_interp; > } > > +/* Adjusts the contents of the stack and related globals for the user > + entry point. The ld.so processed skip_args arguments and bumped > + _dl_argv and _dl_argc accordingly. Those arguments are removed from > + argv here. */ > +static void > +_dl_start_args_adjust (int skip_args) > +{ > + void **sp = (void **) (_dl_argv - skip_args - 1); Is it fully correctly to materialize the address for 'skip_args' equal to 0? I don't think it would matter anyway. > + void **p = sp + skip_args; > + > + if (skip_args == 0) > + return; > + > + /* Sanity check. */ > + intptr_t argc = (intptr_t) sp[0] - skip_args; > + assert (argc == _dl_argc); > + > + /* Adjust argc on stack. */ > + sp[0] = (void *) (intptr_t) _dl_argc; > + > + /* Update globals in rtld. */ > + _dl_argv -= skip_args; > + _environ -= skip_args; > + > + /* Shuffle argv down. */ > + do > + *++sp = *++p; > + while (*p != NULL); > + > + assert (_environ == (char **) (sp + 1)); > + > + /* Shuffle envp down. */ > + do > + *++sp = *++p; > + while (*p != NULL); > + > +#ifdef HAVE_AUX_VECTOR > + void **auxv = (void **) GLRO(dl_auxv) - skip_args; > + GLRO(dl_auxv) = (ElfW(auxv_t *)) auxv; /* Aliasing violation. */ > + assert (auxv == sp + 1); > + > + /* Shuffle auxv down. */ > + void *a, *b; /* Use a pair of pointers for an auxv entry. */ > + unsigned long a_type; > + do > + { > + a_type = ((ElfW(auxv_t) *) (p + 1))->a_type; > + a = *++p; > + b = *++p; > + *++sp = a; > + *++sp = b; > + }> + while (a_type != AT_NULL); > +#endif Maybe: ElfW(auxv_t) ax; do { p = (void**) ((uintptr_t) p + sizeof (ax)); sp = (void**) ((uintptr_t) sp + sizeof (ax)); memcpy (&ax, p, sizeof (ax)); memcpy (sp, &ax, sizeof (ax)); } while (ax.a_type != AT_NULL); Most targets will inline memcpy and if they do not we don't enable IFUNC for ld.so. > +} > + > static void > dl_main (const ElfW(Phdr) *phdr, > ElfW(Word) phnum, > @@ -1359,6 +1415,7 @@ dl_main (const ElfW(Phdr) *phdr, > rtld_is_main = true; > > char *argv0 = NULL; > + char **orig_argv = _dl_argv; > > /* Note the place where the dynamic linker actually came from. */ > GL(dl_rtld_map).l_name = rtld_progname; > @@ -1373,7 +1430,6 @@ dl_main (const ElfW(Phdr) *phdr, > GLRO(dl_lazy) = -1; > } > > - ++_dl_skip_args; > --_dl_argc; > ++_dl_argv; > } > @@ -1382,14 +1438,12 @@ dl_main (const ElfW(Phdr) *phdr, > if (state.mode != rtld_mode_help) > state.mode = rtld_mode_verify; > > - ++_dl_skip_args; > --_dl_argc; > ++_dl_argv; > } > else if (! strcmp (_dl_argv[1], "--inhibit-cache")) > { > GLRO(dl_inhibit_cache) = 1; > - ++_dl_skip_args; > --_dl_argc; > ++_dl_argv; > } > @@ -1399,7 +1453,6 @@ dl_main (const ElfW(Phdr) *phdr, > state.library_path = _dl_argv[2]; > state.library_path_source = "--library-path"; > > - _dl_skip_args += 2; > _dl_argc -= 2; > _dl_argv += 2; > } > @@ -1408,7 +1461,6 @@ dl_main (const ElfW(Phdr) *phdr, > { > GLRO(dl_inhibit_rpath) = _dl_argv[2]; > > - _dl_skip_args += 2; > _dl_argc -= 2; > _dl_argv += 2; > } > @@ -1416,14 +1468,12 @@ dl_main (const ElfW(Phdr) *phdr, > { > audit_list_add_string (&state.audit_list, _dl_argv[2]); > > - _dl_skip_args += 2; > _dl_argc -= 2; > _dl_argv += 2; > } > else if (! strcmp (_dl_argv[1], "--preload") && _dl_argc > 2) > { > state.preloadarg = _dl_argv[2]; > - _dl_skip_args += 2; > _dl_argc -= 2; > _dl_argv += 2; > } > @@ -1431,7 +1481,6 @@ dl_main (const ElfW(Phdr) *phdr, > { > argv0 = _dl_argv[2]; > > - _dl_skip_args += 2; > _dl_argc -= 2; > _dl_argv += 2; > } > @@ -1439,7 +1488,6 @@ dl_main (const ElfW(Phdr) *phdr, > && _dl_argc > 2) > { > state.glibc_hwcaps_prepend = _dl_argv[2]; > - _dl_skip_args += 2; > _dl_argc -= 2; > _dl_argv += 2; > } > @@ -1447,7 +1495,6 @@ dl_main (const ElfW(Phdr) *phdr, > && _dl_argc > 2) > { > state.glibc_hwcaps_mask = _dl_argv[2]; > - _dl_skip_args += 2; > _dl_argc -= 2; > _dl_argv += 2; > } > @@ -1456,7 +1503,6 @@ dl_main (const ElfW(Phdr) *phdr, > { > state.mode = rtld_mode_list_tunables; > > - ++_dl_skip_args; > --_dl_argc; > ++_dl_argv; > } > @@ -1465,7 +1511,6 @@ dl_main (const ElfW(Phdr) *phdr, > { > state.mode = rtld_mode_list_diagnostics; > > - ++_dl_skip_args; > --_dl_argc; > ++_dl_argv; > } > @@ -1511,7 +1556,6 @@ dl_main (const ElfW(Phdr) *phdr, > _dl_usage (ld_so_name, NULL); > } > > - ++_dl_skip_args; > --_dl_argc; > ++_dl_argv; > > @@ -1610,6 +1654,9 @@ dl_main (const ElfW(Phdr) *phdr, > /* Set the argv[0] string now that we've processed the executable. */ > if (argv0 != NULL) > _dl_argv[0] = argv0; > + > + /* Adjust arguments for the application entry point. */ > + _dl_start_args_adjust (_dl_argv - orig_argv); > } > else > { Ok. > diff --git a/sysdeps/mach/hurd/dl-sysdep.c b/sysdeps/mach/hurd/dl-sysdep.c > index 3cbe075615..8373962e62 100644 > --- a/sysdeps/mach/hurd/dl-sysdep.c > +++ b/sysdeps/mach/hurd/dl-sysdep.c > @@ -76,6 +76,7 @@ _dl_sysdep_start (void **start_argptr, > { > void go (intptr_t *argdata) > { > + char *orig_argv0; > char **p; > > /* Cache the information in various global variables. */ > @@ -84,6 +85,8 @@ _dl_sysdep_start (void **start_argptr, > _environ = &_dl_argv[_dl_argc + 1]; > for (p = _environ; *p++;); /* Skip environ pointers and terminator. */ > > + orig_argv0 = _dl_argv[0]; > + > if ((void *) p == _dl_argv[0]) > { > static struct hurd_startup_data nodata; > @@ -173,30 +176,23 @@ _dl_sysdep_start (void **start_argptr, > > /* The call above might screw a few things up. > > - First of all, if _dl_skip_args is nonzero, we are ignoring > - the first few arguments. However, if we have no Hurd startup > - data, it is the magical convention that ARGV[0] == P. The > + P is the location after the terminating NULL of the list of > + environment variables. It has to point to the Hurd startup > + data or if that's missing then P == ARGV[0] must hold. The > startup code in init-first.c will get confused if this is not > the case, so we must rearrange things to make it so. We'll > - overwrite the origional ARGV[0] at P with ARGV[_dl_skip_args]. > + recompute P and move the Hurd data or the new ARGV[0] there. > > - Secondly, if we need to be secure, it removes some dangerous > - environment variables. If we have no Hurd startup date this > - changes P (since that's the location after the terminating > - NULL in the list of environment variables). We do the same > - thing as in the first case but make sure we recalculate P. > - If we do have Hurd startup data, we have to move the data > - such that it starts just after the terminating NULL in the > - environment list. > + Note: directly invoked ld.so can move arguments and env vars. > > We use memmove, since the locations might overlap. */ > - if (__libc_enable_secure || _dl_skip_args) > - { > - char **newp; > > - for (newp = _environ; *newp++;); > + char **newp; > + for (newp = _environ; *newp++;); > > - if (_dl_argv[-_dl_skip_args] == (char *) p) > + if (newp != p || _dl_argv[0] != orig_argv0) > + { > + if (orig_argv0 == (char *) p) > { > if ((char *) newp != _dl_argv[0]) > { Looks ok, but I can't really voucher for hurd code.