Date: Thu, 20 Apr 2023 17:38:01 -0300
Subject: Re: [VERY RFC PATCH 2/2] hurd: Make it possible to call memcpy very early
To: "H.J. Lu", Sergey Bugaev
Cc: libc-alpha@sourceware.org, bug-hurd@gnu.org, Samuel Thibault, Luca
References: <20230420184220.300862-1-bugaevc@gmail.com> <20230420184220.300862-2-bugaevc@gmail.com>
From: Adhemerval Zanella Netto
Organization: Linaro

On 20/04/23 17:25, H.J. Lu via Libc-alpha wrote:
> On Thu, Apr 20, 2023 at 11:43 AM Sergey Bugaev wrote:
>>
>> Normally, in static builds, the first code that runs is _start, in e.g.
>> sysdeps/x86_64/start.S, which quickly calls __libc_start_main, passing
>> it the argv etc.
>> Among the first things __libc_start_main does is
>> initializing the tunables (based on env), then CPU features, and then
>> calls _dl_relocate_static_pie (). Specifically, this runs ifunc
>> resolvers to pick, based on the CPU features discovered earlier, the
>> most suitable implementation of "string" functions such as memcpy.
>>
>> Before that point, calling memcpy (or other ifunc-resolved functions)
>> will not work.
>>
>> In the Hurd port, things are more complex. In order to get argv/env for
>> our process, glibc normally needs to do an RPC to the exec server,
>> unless our args/env are already located on the stack (which is what
>> happens to bootstrap processes spawned by GNU Mach). Fetching our
>> argv/env from the exec server has to be done before the call to
>> __libc_start_main, since we need to know what our argv/env are to pass
>> them to __libc_start_main.
>>
>> On the other hand, the implementation of the RPC (and other initial
>> setup needed on the Hurd before __libc_start_main can be run) is not
>> very trivial. In particular, it may (and on x86_64, will) use memcpy.
>> But as described above, calling memcpy before __libc_start_main can not
>> work, since the GOT entry for it is not yet initialized at that point.
>>
>> Work around this by pre-filling the GOT entry with the baseline version
>> of memcpy, __memcpy_sse2_unaligned. This makes it possible for early
>> calls to memcpy to just work. Once _dl_relocate_static_pie () is called,
>> the baseline version will get replaced with the most suitable one, and
>> that's what subsequent calls of memcpy are going to call.
>>
>> Also, apply the same treatment to __stpncpy, which can also be used by
>> the RPCs (see mig_strncpy.c), and is an ifunc-resolved function on both
>> x86_64 and i386.
>>
>> Tested on x86_64-gnu (!).
>>
>> Signed-off-by: Sergey Bugaev
>> ---
>>
>> Please tell me:
>>
>> * if the approach is at all sane
>> * if there's a better way to do this without hardcoding
>>   "__memcpy_sse2_unaligned"
>> * are the GOT entries for indirect functions supposed to be statically
>>   initialized to anything (in the binary)? if yes, why? if not, why is
>>   PROGBITS and not NOBITS?
>> * am I doing all this _GLOBAL_OFFSET_TABLE_, @GOT, @GOTOFF, @GOTPCREL
>>   correctly?
>> * should there be a !PIC version as well? does the GOT exist under
>>   !PIC (to access indirect functions), and if it does then how do I
>>   access it? it would seem gcc just generates a direct $function even
>>   for indirect functions in this case.
>>
>>  sysdeps/mach/hurd/i386/static-start.S   | 7 +++++++
>>  sysdeps/mach/hurd/x86_64/static-start.S | 8 ++++++++
>>  2 files changed, 15 insertions(+)
>>
>> diff --git a/sysdeps/mach/hurd/i386/static-start.S b/sysdeps/mach/hurd/i386/static-start.S
>> index c5d12645..1b1ae559 100644
>> --- a/sysdeps/mach/hurd/i386/static-start.S
>> +++ b/sysdeps/mach/hurd/i386/static-start.S
>> @@ -19,6 +19,13 @@
>>         .text
>>         .globl _start
>>  _start:
>> +#ifdef PIC
>> +       call __x86.get_pc_thunk.bx
>> +       addl $_GLOBAL_OFFSET_TABLE_, %ebx
>> +       leal __stpncpy_ia32@GOTOFF(%ebx), %eax
>> +       movl %eax, __stpncpy@GOT(%ebx)
>> +#endif
>> +
>>         call _hurd_stack_setup
>>         xorl %edx, %edx
>>         jmp _start1
>> diff --git a/sysdeps/mach/hurd/x86_64/static-start.S b/sysdeps/mach/hurd/x86_64/static-start.S
>> index 982d3d52..81b3c0ac 100644
>> --- a/sysdeps/mach/hurd/x86_64/static-start.S
>> +++ b/sysdeps/mach/hurd/x86_64/static-start.S
>> @@ -19,6 +19,14 @@
>>         .text
>>         .globl _start
>>  _start:
>> +
>> +#ifdef PIC
>> +       leaq __memcpy_sse2_unaligned(%rip), %rax
>> +       movq %rax, memcpy@GOTPCREL(%rip)
>> +       leaq __stpncpy_sse2_unaligned(%rip), %rax
>> +       movq %rax, __stpncpy@GOTPCREL(%rip)
>> +#endif
>> +
>>         call _hurd_stack_setup
>>         xorq %rdx, %rdx
>>         jmp _start1
>> --
>> 2.40.0
>>
>
> Doesn't it disable IFUNC for memcpy and
> stpncpy?
>

Can't you use a strategy similar to the one in commit
5355f9ca7b10183ce06e8a18003ba30f43774858?
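
To make the mechanism under discussion concrete, here is a minimal C model
of what the quoted commit message describes: a slot pre-filled with a
baseline routine so that early calls work, which a later relocation step
overwrites with whatever the resolver picks. This is only an illustrative
sketch and not glibc code. Every name in it (copy_slot, copy_baseline,
copy_optimized, resolve_copy, relocate_copy_slot, early_safe_copy) is
hypothetical, a plain function pointer stands in for the GOT entry, and an
int flag stands in for the CPU feature checks.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical signature shared by all copy variants in this model.  */
typedef void *(*copy_fn) (void *, const void *, size_t);

/* Call counters, so a caller can observe which variant actually ran.  */
int baseline_calls;
int optimized_calls;

/* Stand-in for a baseline variant (think __memcpy_sse2_unaligned).  */
void *
copy_baseline (void *dst, const void *src, size_t n)
{
  baseline_calls++;
  return memmove (dst, src, n);
}

/* Stand-in for a variant chosen from CPU features.  */
void *
copy_optimized (void *dst, const void *src, size_t n)
{
  optimized_calls++;
  return memmove (dst, src, n);
}

/* Models the GOT entry.  It is pre-filled with the baseline variant, so
   calls made before relocation already have a valid target.  */
copy_fn copy_slot = copy_baseline;

/* Models an ifunc resolver: picks a variant from (assumed) CPU features.  */
copy_fn
resolve_copy (int cpu_has_fast_path)
{
  return cpu_has_fast_path ? copy_optimized : copy_baseline;
}

/* Models the relocation step: runs the resolver and overwrites the slot,
   replacing the pre-filled baseline value.  */
void
relocate_copy_slot (int cpu_has_fast_path)
{
  copy_fn chosen = resolve_copy (cpu_has_fast_path);
  assert (chosen != NULL);
  copy_slot = chosen;
}

/* Every call goes through the slot, as a PIC call goes through the GOT.  */
void *
early_safe_copy (void *dst, const void *src, size_t n)
{
  return copy_slot (dst, src, n);
}
```

In this model, a call to early_safe_copy made before relocate_copy_slot
runs lands in copy_baseline, and a call made afterwards lands in whichever
variant the resolver returned. Whether the real startup sequence behaves
the same way for the pre-filled GOT entries is exactly the question raised
above.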