From: Sergey Bugaev
Date: Fri, 21 Apr 2023 14:51:35 +0300
Subject: Re: [VERY RFC PATCH 2/2] hurd: Make it possible to call memcpy very early
To: "H.J. Lu"
Cc: libc-alpha@sourceware.org, bug-hurd@gnu.org, Samuel Thibault, Luca
References: <20230420184220.300862-1-bugaevc@gmail.com> <20230420184220.300862-2-bugaevc@gmail.com>

Hello,

On Thu, Apr 20, 2023 at 11:26 PM H.J. Lu wrote:
> Doesn't it disable IFUNC for memcpy and stpncpy?

I was hoping you'd tell me whether it does :|

I *think* on i386 it does indeed (so I'd need to rework that part of
the patch), but not on x86_64. This is based on the following
observations:

1. in glibc source, sysdeps/i386/dl-irel.h does this:

static inline void
__attribute ((always_inline))
elf_irel (const Elf32_Rel *reloc)
{
  Elf32_Addr *const reloc_addr = (void *) reloc->r_offset;
  const unsigned long int r_type = ELF32_R_TYPE (reloc->r_info);

  if (__glibc_likely (r_type == R_386_IRELATIVE))
    {
      Elf32_Addr value = elf_ifunc_invoke(*reloc_addr);
      *reloc_addr = value;
    }
  else
    __libc_fatal ("Unexpected reloc type in static binary.\n");
}

i.e.
reading the ifunc ptr from the relocated address itself, whereas on
x86_64 it's:

static inline void
__attribute ((always_inline))
elf_irela (const ElfW(Rela) *reloc)
{
  ElfW(Addr) *const reloc_addr = (void *) reloc->r_offset;
  const unsigned long int r_type = ELFW(R_TYPE) (reloc->r_info);

  if (__glibc_likely (r_type == R_X86_64_IRELATIVE))
    {
      ElfW(Addr) value = elf_ifunc_invoke(reloc->r_addend);
      *reloc_addr = value;
    }
  else
    __libc_fatal ("Unexpected reloc type in static binary.\n");
}

i.e. the ifunc resolver is stored in the addend, and the initial value
of *reloc_addr is ignored.

Checking arm and aarch64, I see that arm uses *reloc_addr like i386,
and aarch64 uses r_addend like x86_64. But (unlike i386 and like
x86_64) arm also has an ifunc relocation for memcpy, so (if someone
were to work on an arm-gnu port) we would still have the same issue
there, and this approach wouldn't work -- but see below.

2. When dumping relocations with readelf --wide --relocs, for the
x86_64-gnu build I see the addends vary, but for i386-gnu they're just
empty. That means readelf considers R_X86_64_IRELATIVE to be a rela,
and R_386_IRELATIVE to be a rel.

3. When looking at the initial values of the GOT entries, on i386 they
do point to ifunc resolvers; on x86_64 they don't seem to.

4. I've now tried asking qemu for a better CPU, and sure enough, I get
the GOT entry pointing to __memcpy_avx_unaligned_erms.
Here's a little demo:

(gdb) bt
#0  __memcpy_avx_unaligned_erms () at ../sysdeps/x86_64/multiarch/memmove-vec-unaligned-erms.S:256
#1  0x00000000004b48d8 in __device_write_inband (device=device@entry=6, mode=mode@entry=0, recnum=recnum@entry=0,
    data=data@entry=0x4db0f2 "\r\n", dataCnt=dataCnt@entry=2, bytes_written=bytes_written@entry=0xbffffcdc)
    at /home/sergey/dev/crosshurd64/src/glibc/build/mach/RPC_device_write_inband.c:158
#2  0x0000000000400d6b in write_some (to_write=2, p=0x4db0f2 "\r\n") at devstream.c:45
#3  write_crlf () at devstream.c:58
#4  devstream_write (cookie=, buffer=0x20000d30 "Well hello friends!\n", n=20) at devstream.c:70
#5  0x0000000000424841 in _IO_cookie_write (fp=0x20000c10, buf=, size=20) at iofopncook.c:59
#6  0x0000000000425234 in new_do_write (fp=0x20000c10, data=0x20000d30 "Well hello friends!\n", to_do=to_do@entry=20)
    at /home/sergey/dev/crosshurd64/src/glibc/libio/libioP.h:1031
#7  0x0000000000425959 in _IO_new_do_write (fp=, data=, to_do=20) at fileops.c:425
#8  0x00000000004266e0 in _IO_new_file_sync (fp=0x20000c10) at fileops.c:798
#9  0x0000000000424542 in _IO_fflush (fp=0x20000c10) at /home/sergey/dev/crosshurd64/src/glibc/libio/libioP.h:1031
#10 0x0000000000400bea in main (argc=2, argv=0xbfffffa8) at /home/sergey/dev/mach-bootstrap-hello.c:69
#11 0x000000000040de73 in __libc_start_call_main (argv=0xbfffffa8, argc=2, main=0x400ad2)
    at ../sysdeps/generic/libc_start_call_main.h:23
#12 __libc_start_main_impl (main=, argc=2, argv=0xbfffffa8, init=, fini=, rtld_fini=, stack_end=)
    at ../csu/libc-start.c:360
#13 0x0000000000400961 in _start1 () at ../sysdeps/x86_64/start.S:115

Actually maybe we could make this hack work for the architectures that
do use *reloc_addr too: instead of just rewriting the GOT entry and
leaving it at that, we'd restore the original pointer (i.e.
__memcpy_ifunc) after we've done the early Hurd-specific setup, right
before jumping to _start1. Maybe this awesome/horrible trick could
even be used to enable ifunc-selected memcpy for i386 -- and not only
on the Hurd, but for i386-linux-gnu as well?

To reiterate: set the GOT entry to a known-good baseline version very
early, then call memcpy the usual way all you like, then before doing
_dl_relocate_static_pie reset the GOT entry back to the ifunc
resolver.

Please tell me if what I'm saying makes sense; I may sound confident,
but I'm really not. This really needs someone way more experienced
than me to look into it and judge.

Sergey
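
P.S. To make the set-then-restore idea a bit more concrete, here is a
minimal user-space simulation. It is only a sketch under loud
assumptions: got_slot, baseline_memcpy, fancy_memcpy, memcpy_resolver,
apply_irelative_rel and early_memcpy_demo are all hypothetical names
invented for illustration; none of this is glibc code, and a real GOT
entry is patched from startup code rather than through a C variable.
It models a REL-style target (i386, arm), where the slot itself
initially holds the resolver address.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

typedef void *(*memcpy_fn) (void *, const void *, size_t);

/* Hypothetical baseline: a plain byte loop that needs no CPU feature
   detection, so it is safe to call before any setup has run.  */
void *
baseline_memcpy (void *dst, const void *src, size_t n)
{
  unsigned char *d = dst;
  const unsigned char *s = src;
  while (n--)
    *d++ = *s++;
  return dst;
}

/* Stand-in for an optimized variant the resolver might select.  */
void *
fancy_memcpy (void *dst, const void *src, size_t n)
{
  return memcpy (dst, src, n);
}

/* Stand-in for the ifunc resolver: returns the chosen variant.  */
memcpy_fn
memcpy_resolver (void)
{
  return fancy_memcpy;
}

/* Simulated GOT slot (assumption: one slot, same size as a pointer).  */
uintptr_t got_slot;

/* REL-style processing: the resolver is read from the slot itself,
   then the slot is overwritten with the resolver's result.  */
void
apply_irelative_rel (uintptr_t *slot)
{
  memcpy_fn (*resolver) (void) = (memcpy_fn (*) (void)) *slot;
  *slot = (uintptr_t) resolver ();
}

/* Walks through the three phases; returns 0 if everything worked.  */
int
early_memcpy_demo (void)
{
  char buf[8];

  /* The slot as the static linker would leave it on a REL target.  */
  got_slot = (uintptr_t) memcpy_resolver;
  uintptr_t saved = got_slot;

  /* Phase 1: point the slot at the baseline and call through it.  */
  got_slot = (uintptr_t) baseline_memcpy;
  ((memcpy_fn) got_slot) (buf, "early", 6);
  assert (strcmp (buf, "early") == 0);

  /* Phase 2: restore the resolver before relocations are applied.  */
  got_slot = saved;

  /* Phase 3: ordinary IRELATIVE processing picks the real variant.  */
  apply_irelative_rel (&got_slot);
  ((memcpy_fn) got_slot) (buf, "late", 5);

  return (got_slot == (uintptr_t) fancy_memcpy
          && strcmp (buf, "late") == 0) ? 0 : 1;
}
```

If phase 2 is skipped, apply_irelative_rel would call baseline_memcpy
as if it were a resolver, which is exactly the REL-target breakage
described in observation 1. On a RELA target the restore step would
not be needed, since the resolver comes from the addend.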