From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <SRS0=e7gw=AE=gmail.com=bugaevc@sourceware.org>
Received: from mail-ot1-x331.google.com (mail-ot1-x331.google.com [IPv6:2607:f8b0:4864:20::331])
	by sourceware.org (Postfix) with ESMTPS id 68A763858C33
	for <libc-alpha@sourceware.org>; Thu, 13 Apr 2023 10:03:10 +0000 (GMT)
DMARC-Filter: OpenDMARC Filter v1.4.2 sourceware.org 68A763858C33
Authentication-Results: sourceware.org; dmarc=pass (p=none dis=none) header.from=gmail.com
Authentication-Results: sourceware.org; spf=pass smtp.mailfrom=gmail.com
Received: by mail-ot1-x331.google.com with SMTP id v9-20020a05683024a900b006a42896c456so1859819ots.8
        for <libc-alpha@sourceware.org>; Thu, 13 Apr 2023 03:03:10 -0700 (PDT)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed;
        d=gmail.com; s=20221208; t=1681380189; x=1683972189;
        h=content-transfer-encoding:cc:to:subject:message-id:date:from
         :in-reply-to:references:mime-version:from:to:cc:subject:date
         :message-id:reply-to;
        bh=C41jj+iloKFWlYYQwD21Ihx3mKtO48VEfKk5yIj8EN0=;
        b=HtMmeYZe9k4+zE8s+IcnhZDbKukZV2o6qANg12NfNFnCv0ei3MOfo9BJwWKaoR8+jm
         SC9wyouC91f7qO4eBNiitIyxvuyv1Cng5FbjhjmnN8Agglqll2NBArTzurv1cedALtS8
         SdwJKOokiiUcv88zR8bRRqt2aMBJwvIiQBqVpbYvpfUmOGVdmvUCXAPkku3gcFLGCOUV
         Ur6rRxa3sHpniWgXdwYcdJD3eMsIBMiMUpi/n4WN+O/B6Xh/IK/pOSyIna2QWxs1lN6l
         8SOw/6IWdn3XGvSSpHI+9HyCo5h6KWuOtPFQ3NzQXArUs/L/pF+DXUlu5tsHRn2AFuOo
         RBUg==
X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed;
        d=1e100.net; s=20221208; t=1681380189; x=1683972189;
        h=content-transfer-encoding:cc:to:subject:message-id:date:from
         :in-reply-to:references:mime-version:x-gm-message-state:from:to:cc
         :subject:date:message-id:reply-to;
        bh=C41jj+iloKFWlYYQwD21Ihx3mKtO48VEfKk5yIj8EN0=;
        b=I0EvQUFxV3DaniVsEe8xZl5PgA3fuQDwcmKilVD2E3S6pT45/gvJXCTwekV01mJXXU
         YMKeEqfBEC1FcQyLN6s8MDv4FG6FiPCiCAqjJ5jz3S0yjSgB2PwtLSCohiQZ3auPrFUV
         +XzcsVufPUPhUTTbam0cQxF03dTf4dBPKq49kmMbf94TSmwOmtzPzGCDbZw7s1/xkOFn
         9wvih+Q0HoMDsrExPmKyNUL+Ley3qJNtNboVqcNMCNptRFtUuKLjdxP1UmbUojqNrPhl
         B2zalOB5wxT6yxVK83sTRrmesVvt1zVT1rUKP5ZVl6Nam7TKKUU7mjADiO+vEz//K7Hv
         yR8Q==
X-Gm-Message-State: AAQBX9dhzsYh3gQpKb6XEB23f6v19dwzyGzqBctoRROcEfO8HvfpQ6Ox
	t1BKcxaurxq2D5Yt+S7WmBu0vuibZDF1/cpaA9kj52F0bXQ=
X-Google-Smtp-Source: AKy350bYj6ZCxzVTSzn/qbKCIJvfOaxHKvKbgcEXFkGwlKSfUODMn/Bf2rtfdBo+UEo0teKiLdjkA4NV9/wTOW/wAF0=
X-Received: by 2002:a9d:7346:0:b0:6a4:2f50:74fb with SMTP id
 l6-20020a9d7346000000b006a42f5074fbmr377612otk.7.1681380189345; Thu, 13 Apr
 2023 03:03:09 -0700 (PDT)
MIME-Version: 1.0
References: <CAN9u=HeNejJ91mRCy-oZrDKmWZC6KfYyRf0Mz=q9v=LAxf_=sw@mail.gmail.com>
 <CAN9u=HdxsLpupwo9937XyWwrKyUQSHUMK8EaBz_YyJFga7gYeg@mail.gmail.com> <20230412234657.ntztyz7iau55lcwt@begin>
In-Reply-To: <20230412234657.ntztyz7iau55lcwt@begin>
From: Sergey Bugaev <bugaevc@gmail.com>
Date: Thu, 13 Apr 2023 13:02:58 +0300
Message-ID: <CAN9u=HdW51ajNDh1C85ifTGgHSnayDG-8TLss26D2-RX9Kosdg@mail.gmail.com>
Subject: Re: [RFC PATCH glibc 24/34] hurd: Only check for TLS initialization
 inside rtld or in static builds
To: Samuel Thibault <samuel.thibault@gnu.org>
Cc: libc-alpha@sourceware.org, bug-hurd@gnu.org
Content-Type: text/plain; charset="UTF-8"
Content-Transfer-Encoding: quoted-printable
X-Spam-Status: No, score=-1.7 required=5.0 tests=BAYES_00,DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF,FREEMAIL_FROM,RCVD_IN_DNSWL_NONE,SPF_HELO_NONE,SPF_PASS,TXREP,URIBL_BLACK autolearn=no autolearn_force=no version=3.4.6
X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org
List-Id: <libc-alpha.sourceware.org>

Wow, this is great, thank you! You've really gone above and beyond
compared to what I expected you to do.

Some replies below; will reply to other points later.

On Thu, Apr 13, 2023 at 2:47 AM Samuel Thibault <samuel.thibault@gnu.org> wrote:
> > Maybe you're building with some flags that affect this? I'm only doing
> > ../configure.
>
> I'm using
>
> ../configure --prefix= --enable-pt_chown

Yeah, that shouldn't influence anything vs what I have.

> I have uploaded the build result of master +
> b37899d34d2190ef4b454283188f22519f096048 restored on:
>
> https://dept-info.labri.fr/~thibault/tmp/libc.so.0.3
> https://dept-info.labri.fr/~thibault/tmp/ld.so
> https://dept-info.labri.fr/~thibault/tmp/test-as-const-rtld-sizes
>
> you can run it by hand with
> ./ld.so --library-path $PWD ./test-as-const-rtld-sizes
>
> It hangs on my system. I have put the core dump on
>
> https://dept-info.labri.fr/~thibault/tmp/core.18601
>
> which can be inspected with
>
> gdb ./ld.so core.18601

Thank you, I'm going to take a look.

> Running live gdb ./ld.so 18529, I get:
>
> (gdb) thread apply all bt
>
> Thread 2 (Thread 18529.2):
> #0  0x0102aa3c in __GI___mach_msg_trap () at /usr/src/glibc-upstream/build/mach/mach_msg_trap.S:2
> #1  0x0102b1d6 in __GI___mach_msg (msg=0x1315d10, option=3, send_size=64, rcv_size=32, rcv_name=0, timeout=0, notify=0) at msg.c:111
> #2  0x012c9850 in __gsync_wait (task=<optimized out>, addr=<optimized out>, val1=<optimized out>, val2=<optimized out>, msec=<optimized out>, flags=<optimized out>) at ./build-tree/hurd-i386-libc/mach/RPC_gsync_wait.c:186
> #3  0x0104631b in __GI___spin_lock (__lock=0x12bb844 <_hurd_siglock>) at ../mach/lock-intern.h:60
> #4  __GI___mutex_lock (__lock=0x12bb844 <_hurd_siglock>) at ../mach/lock-intern.h:119
> #5  __GI__hurd_thread_sigstate (thread=<optimized out>) at hurdsig.c:80
> #6  0x0116abb8 in _hurd_critical_section_lock () at ../hurd/hurd/signal.h:230
> #7  _hurd_fd_get (fd=2) at ../hurd/hurd/fd.h:74
> #8  __GI___write_nocancel (fd=2, buf=0x1315e60, nbytes=<optimized out>) at ../sysdeps/mach/hurd/write_nocancel.c:26
> #9  0x01149135 in __GI___libc_write (fd=2, buf=0x1315e60, nbytes=41) at ../sysdeps/mach/hurd/write.c:26
> #10 0x0116ff07 in __GI___writev (fd=<optimized out>, vector=<optimized out>, count=<optimized out>) at ../sysdeps/posix/writev.c:87
> #11 0x010b9df5 in writev_for_fatal (fd=<optimized out>, total=<optimized out>, niov=<optimized out>, iov=<optimized out>) at ../sysdeps/posix/libc_fatal.c:44
> #12 __libc_message (fmt=<optimized out>) at ../sysdeps/posix/libc_fatal.c:124
> #13 0x010b9ead in __GI___libc_fatal (message=0x12216b4 "hurd: Can't add reference on Mach thread\n") at ../sysdeps/posix/libc_fatal.c:159
> #14 0x01046524 in __GI__hurd_thread_sigstate (thread=<optimized out>) at hurdsig.c:136
> #15 0x0103fd33 in __GI__hurd_self_sigstate () at ../hurd/hurd/signal.h:173
> #16 _hurd_msgport_receive (arg=<error reading variable: Cannot access memory at address 0x1316004>) at msgportdemux.c:47
> Backtrace stopped: Cannot access memory at address 0x1316000

So the immediate cause of the hang is we deadlock trying to take the
_hurd_siglock while already holding it (inside #14 0x01046524 in
__GI__hurd_thread_sigstate), since it's just a non-recursive struct
mutex. We should probably write our own version of writev_for_fatal
that does no locking (other than maybe trylock) and tries to not touch
any TLS or sigstate. Like, just grab the port from _hurd_dtable (with
no critical sections, no nothin') and call io_write on it (with the
regular non-intr mach_msg).
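
To make the trylock idea concrete, here is a minimal sketch in portable
C11. It is not Hurd code: demo_siglock, demo_try_lock, demo_unlock and
fatal_write_best_effort are hypothetical names, an atomic_flag stands in
for _hurd_siglock, and plain write () stands in for calling io_write on
the port directly. The only point it shows is that the fatal path never
blocks on a lock it may already hold.

```c
#include <assert.h>
#include <stdatomic.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical stand-in for _hurd_siglock, modeled as a spin-lock flag.  */
atomic_flag demo_siglock = ATOMIC_FLAG_INIT;

/* Returns 1 if the lock was free and is now ours, 0 if it was already held.
   Never blocks.  */
int demo_try_lock (void)
{
  return !atomic_flag_test_and_set (&demo_siglock);
}

void demo_unlock (void)
{
  atomic_flag_clear (&demo_siglock);
}

/* Best-effort writer for fatal messages (assumed design, not glibc's):
   take the lock only if that is possible without waiting, write the
   message either way, and release the lock only if we took it.
   Returns 1 if the write happened under the lock, 0 otherwise.  */
int fatal_write_best_effort (int fd, const char *msg)
{
  assert (msg != NULL);

  int locked = demo_try_lock ();

  /* The result is deliberately ignored: there is nothing sensible to do
     about a failed write while already reporting a fatal error.  */
  ssize_t ignored = write (fd, msg, strlen (msg));
  (void) ignored;

  if (locked)
    demo_unlock ();
  return locked;
}
```

If the caller already holds the lock (as in frame #14 above), the
function still writes the message and returns 0 instead of hanging.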

That, and abort () should be more careful with locking the sigstate
too, in order not to fault and/or deadlock if sigstate / TLS is
broken. Well, in this case we didn't even get to abort (), we
deadlocked on the write, but abort would be next.

But the underlying issue is that the thread port is bogus, which --
what? How can this even happen? This is not even some other thread's
port (maybe a thread could die and then its port right would turn into
a dead name, and then mach_port_mod_refs would return
KERN_INVALID_RIGHT if you try to add a send right...), this is our
very own thread port, the result of mach_thread_self (), which was
called just moments ago in _hurd_self_sigstate ()!

If we're trying to tie this to TLS somehow, maybe the port is fine,
but the mach_port_mod_refs RPC fails because something is off with the
reply port. But also note that at this point we're well into libc.so
(this is the msgport thread already, and the other thread below is
already running user's code!), so clearly the TLS must have been set
up (and not by TLS_INIT_TP, this is not the main thread). And since we
managed to create the thread (which is done in libc.so, not ld.so),
TLS must have been OK in the main thread too (thread_create is not a
direct syscall).

And if TLS wasn't set up, I'd expect a TLS access to segfault or
busfault or something, not to read bogus data -- or am I wrong? What
does $gs_base initially point to, before it's set up? I would assume
it's either just 0x0, or the %gs:something read does not work at all
(SIGBUS).

> Interestingly, watching for the $gs update:
>
> € gdb --args ./ld.so --library-path=/tmp ./test-as-const-rtld-sizes
> (gdb) b _start
> Breakpoint 1 at 0x1a5a0
> (gdb) r
> Starting program: /tmp/ld.so --library-path /tmp ./test-as-const-rtld-sizes
>
> Thread 5 hit Breakpoint 1, 0x0801a5a0 in _start ()
> (gdb) watch $gs
> Watchpoint 2: $gs
> (gdb) c
> Continuing.

Cool, I didn't know you could watch a register like that -- although
it appears to be super slow, so it must not be using hardware
watchpoints.

> At that point the library loading has happened:
>
> (gdb) info sharedlibrary
> From        To          Syms Read   Shared Object Library
> 0x08000db0  0x080256e1  Yes         /tmp/ld.so
> 0x0102a650  0x01200d35  No          /tmp/libc.so.0.3
> 0x012c49a0  0x012d0ad4  No          /tmp/libmachuser.so.1
> 0x012e0bc0  0x012fee50  No          /tmp/libhurduser.so.0.3
>
> And the function symbols indeed seem to have been overloaded:
>
> (gdb) l __write
> 384     __write (int fd, const void *buf, size_t nbytes)
> 385     {
> 386       error_t err;
> 387       vm_size_t nwrote;
> 388
> 389       assert (fd < _hurd_init_dtablesize);
>
>
> That is why I'm thinking that apparently exposing the libc functions
> happens before setting up TLS, and thus potential for mayhem if libc
> assumes that TLS is set up. The loading itself is apparently done in the
> _dl_map_object_deps call of dl_main.

Well, if that's why you're thinking libc.so functions are already in
use by ld.so, this will be easy to disprove :)

GDB is super cool, but it's not *that* smart. When you "l __write", it
likely just looks through the "loaded" DSOs and finds the symbol,
looks up its debuginfo, then source, and prints that. It may even say
that there are several places where a symbol with the same name is
defined. Here's what I get (on a different executable):

Thread 4 hit Temporary breakpoint 1, 0x080483f0 in main ()
(gdb) l __write
file: "../sysdeps/mach/hurd/dl-sysdep.c", line number: 389, symbol: "__write"
384     ../sysdeps/mach/hurd/dl-sysdep.c: No such file or directory.
file: "../sysdeps/mach/hurd/write.c", line number: 25, symbol: "__GI___libc_write"
20      ../sysdeps/mach/hurd/write.c: No such file or directory.

Point is, GDB can look up symbols in DSOs, but it doesn't understand
symbol resolution rules: RTLD_LOCAL vs GLOBAL, linking namespaces,
whether a DSO has already been relocated or not, all of those things.

What you should really check is not what GDB prints on "l __write",
but rather what the GOT/PLT slots inside ld.so contain; that is where
ld.so will jump when it calls the functions. Here's an annotated GDB
session (with upstream Debian's glibc):

$ gdb -q ./hello
Reading symbols from ./hello...
# Start running so gdb can resolve addresses & symbols inside ld.so:
(gdb) starti
Starting program: /home/bugaevc/hello
Thread 4 stopped.
0x0001d550 in _start () from /lib/ld.so
# Let's look at the GOT/PLT entry for __write (actually __write_nocancel):
(gdb) p &'__write_nocancel@got.plt'
$1 = (<text from jump slot in .got.plt, no debug info> *) 0x36034 <__write_nocancel@got.plt>
(gdb) p '__write_nocancel@got.plt'
$2 = (<text from jump slot in .got.plt, no debug info>) 0xde6
# The entry itself is at 0x36034 (this will come in useful later), but
# as of _start, it contains garbage (which is to be expected). Now let's
# advance to TLS setup, and check again:
(gdb) advance __i386_set_gdt
__i386_set_gdt (target_thread=96, selector=0x1037b94, desc=...) at ./build-tree/hurd-i386-libc/mach/RPC_i386_set_gdt.c:79
79      ./build-tree/hurd-i386-libc/mach/RPC_i386_set_gdt.c: No such file or directory.
(gdb) p '__write_nocancel@got.plt'
$3 = (<text from jump slot in .got.plt, no debug info>) 0x1c1f0 <__write>
# See, now the PLT entry points to __write. But whose __write is this?
(gdb) info symbol 0x1c1f0
__write_nocancel in section .text of /lib/ld.so
# It's ld.so's! That's because it has initially relocated itself so
# that its symbols point back to itself.
# And now let's advance to where the signal thread is set up:
(gdb) tb _hurdsig_init
Function "_hurdsig_init" not defined.
Make breakpoint pending on future shared library load? (y or [n]) y
Temporary breakpoint 1 (_hurdsig_init) pending.
(gdb) c
Continuing.
Thread 4 hit Temporary breakpoint 1, _hurdsig_init (intarray=0x103c000, intarraysize=5) at ./hurd/hurdsig.c:1453
1453    ./hurd/hurdsig.c: No such file or directory.
# Sanity check: are we inside libc.so now?
(gdb) info symbol $eip
_hurdsig_init in section .text of /lib/i386-gnu/libc.so.0.3
# Yes we are, good. Let's look at the PLT entry:
(gdb) p '__write_nocancel@got.plt'
$4 = (<text from jump slot in .got.plt, no debug info>) 0x11b5820 <__GI___write_nocancel>
# Huh, it clearly points to a different __write_nocancel now! But
# whose PLT entry are we looking at now, ld.so's or libc.so's? Who
# knows, let's put in that address from above explicitly:
(gdb) p *(void**) 0x36034
$5 = (void *) 0x11b5820 <__GI___write_nocancel>
# And just to be very sure, this is libc.so's __write_nocancel, right?
(gdb) info symbol 0x11b5820
__write_nocancel in section .text of /lib/i386-gnu/libc.so.0.3
# Right :)

Hope I'm making my point clear: "l __write" is no basis to suspect
that ld.so has already been bound to libc.so functions. Something else
must be going on.
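
For anyone following along, here is a toy model in plain C of what the
session above observes. It is only an illustration under loud
assumptions: got_slot is an ordinary function pointer standing in for a
.got.plt jump slot, and rtld_write_nocancel, libc_write_nocancel,
bootstrap_relocate, rebind_to_libc and call_through_slot are
hypothetical names, not anything in glibc. What it shows is that a call
goes wherever the slot points at that moment, regardless of how many
objects define a symbol with that name.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Two definitions standing in for "the same symbol" in two objects.
   Each just reports which object it lives in.  */
const char *rtld_write_nocancel (void) { return "ld.so"; }
const char *libc_write_nocancel (void) { return "libc.so"; }

/* Toy stand-in for a jump slot: call sites always go through it.  */
typedef const char *(*write_fn) (void);
write_fn got_slot = NULL;

/* Phase 1 (as seen at TLS setup in the session): ld.so has relocated
   itself, so the slot points at its own copy.  */
void bootstrap_relocate (void)
{
  got_slot = rtld_write_nocancel;
}

/* Phase 2 (as seen at _hurdsig_init in the session): the same slot now
   holds the address of libc.so's definition.  */
void rebind_to_libc (void)
{
  got_slot = libc_write_nocancel;
}

/* What a call site effectively does: jump to whatever the slot holds.  */
const char *call_through_slot (void)
{
  assert (got_slot != NULL);
  return got_slot ();
}
```

Calling bootstrap_relocate () and then call_through_slot () reaches the
"ld.so" copy; after rebind_to_libc () the very same call site reaches
the "libc.so" copy, which is why the slot contents, not a name lookup,
are the thing to inspect.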

I'll be back with more results, and thank you again.

Sergey