public inbox for gcc-bugs@sourceware.org
* [Bug target/113874] New: GNU2 TLS descriptor calls do not follow psABI on x86_64-linux-gnu
@ 2024-02-11 13:59 fw at gcc dot gnu.org
  2024-02-11 14:03 ` [Bug target/113874] " fw at gcc dot gnu.org
                   ` (39 more replies)
  0 siblings, 40 replies; 41+ messages in thread
From: fw at gcc dot gnu.org @ 2024-02-11 13:59 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113874

            Bug ID: 113874
           Summary: GNU2 TLS descriptor calls do not follow psABI on
                    x86_64-linux-gnu
           Product: gcc
           Version: 13.2.1
            Status: UNCONFIRMED
          Keywords: wrong-code
          Severity: normal
          Priority: P3
         Component: target
          Assignee: unassigned at gcc dot gnu.org
          Reporter: fw at gcc dot gnu.org
  Target Milestone: ---
            Target: x86_64-linux-gnu

Consider this test case:

struct tls {
  long a, b, c, d;
};

extern __thread struct tls tls_var __attribute__ ((visibility ("hidden")));

void
apply_tls (struct tls *p)
{
  tls_var = *p;
}

With “-O2 -fpic -mtls-dialect=gnu2”, it gets compiled to:

apply_tls:
.LFB0:
        .cfi_startproc
        subq    $8, %rsp
        .cfi_def_cfa_offset 16
        movdqu  (%rdi), %xmm0
        leaq    tls_var@TLSDESC(%rip), %rax
        call    *tls_var@TLSCALL(%rax)
        addq    %fs:0, %rax
        movups  %xmm0, (%rax)
        movdqu  16(%rdi), %xmm1
        movups  %xmm1, 16(%rax)
        addq    $8, %rsp
        .cfi_def_cfa_offset 8
        ret
        .cfi_endproc

Note how %xmm0 is loaded before the descriptor call. The glibc implementation
assumes the regular psABI calling convention, under which %xmm0 is
call-clobbered, so its contents may be destroyed by the call.

Discovered as a crash in the polymake testsuite
(/fan/objects/Geometry/PolyhedralFan/properties/Combinatorics/DUAL_GRAPH) if
its dependency nauty is compiled with -mtls-dialect=gnu2.

(i686-linux-gnu has the same issue with -msse2.)
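
The test case above only declares tls_var, so it does not link on its own. A
minimal self-contained variant could look like the sketch below. It assumes the
variable is defined in the same translation unit, and it adds two hypothetical
helpers (tls_matches and check_apply_tls) that are not part of the original
report. It only verifies that all four fields survive the copy. Whether the
TLS descriptor slow path is actually reached depends on the build flags and on
how the containing object is loaded (for example via dlopen), which this sketch
does not arrange.

```c
#include <assert.h>

struct tls {
  long a, b, c, d;
};

/* Assumption: defined here (the report only declares it extern) so that
   the example links standalone.  */
__thread struct tls tls_var __attribute__ ((visibility ("hidden")));

void
apply_tls (struct tls *p)
{
  tls_var = *p;
}

/* Hypothetical helper, not in the original report: returns 1 if the
   thread-local copy matches *p field by field.  */
int
tls_matches (const struct tls *p)
{
  return tls_var.a == p->a && tls_var.b == p->b
         && tls_var.c == p->c && tls_var.d == p->d;
}

/* Hypothetical self-check: copy a known value and compare.  */
void
check_apply_tls (void)
{
  struct tls v = { 1, 2, 3, 4 };
  apply_tls (&v);
  assert (tls_matches (&v));
}
```

Calling check_apply_tls () from main is enough to run the check; a corrupted
%xmm0 would show up as a mismatch in the a and b fields.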

* [Bug target/113874] GNU2 TLS descriptor calls do not follow psABI on x86_64-linux-gnu
  2024-02-11 13:59 [Bug target/113874] New: GNU2 TLS descriptor calls do not follow psABI on x86_64-linux-gnu fw at gcc dot gnu.org
@ 2024-02-11 14:03 ` fw at gcc dot gnu.org
  2024-02-11 14:56 ` jakub at gcc dot gnu.org
                   ` (38 subsequent siblings)
  39 siblings, 0 replies; 41+ messages in thread
From: fw at gcc dot gnu.org @ 2024-02-11 14:03 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113874

--- Comment #1 from Florian Weimer <fw at gcc dot gnu.org> ---
Brought to the x86-64 ABI list:

GCC and the GNU2 TLS descriptor call ABI
<https://groups.google.com/g/x86-64-abi/c/NXQve2SPubc>

* [Bug target/113874] GNU2 TLS descriptor calls do not follow psABI on x86_64-linux-gnu
  2024-02-11 13:59 [Bug target/113874] New: GNU2 TLS descriptor calls do not follow psABI on x86_64-linux-gnu fw at gcc dot gnu.org
  2024-02-11 14:03 ` [Bug target/113874] " fw at gcc dot gnu.org
@ 2024-02-11 14:56 ` jakub at gcc dot gnu.org
  2024-02-11 15:05 ` hjl.tools at gmail dot com
                   ` (37 subsequent siblings)
  39 siblings, 0 replies; 41+ messages in thread
From: jakub at gcc dot gnu.org @ 2024-02-11 14:56 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113874

Jakub Jelinek <jakub at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |aoliva at gcc dot gnu.org,
                   |                            |jakub at gcc dot gnu.org

--- Comment #2 from Jakub Jelinek <jakub at gcc dot gnu.org> ---
Alex, what was the original intention here?
It wouldn't surprise me if the intention was to clobber as few registers as
possible because in the common case the call doesn't really call much, just a
TLS load or so, and one can have hundreds of those in a single function.

* [Bug target/113874] GNU2 TLS descriptor calls do not follow psABI on x86_64-linux-gnu
  2024-02-11 13:59 [Bug target/113874] New: GNU2 TLS descriptor calls do not follow psABI on x86_64-linux-gnu fw at gcc dot gnu.org
  2024-02-11 14:03 ` [Bug target/113874] " fw at gcc dot gnu.org
  2024-02-11 14:56 ` jakub at gcc dot gnu.org
@ 2024-02-11 15:05 ` hjl.tools at gmail dot com
  2024-02-11 17:57 ` hjl.tools at gmail dot com
                   ` (36 subsequent siblings)
  39 siblings, 0 replies; 41+ messages in thread
From: hjl.tools at gmail dot com @ 2024-02-11 15:05 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113874

--- Comment #3 from H.J. Lu <hjl.tools at gmail dot com> ---
Created attachment 57385
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=57385&action=edit
A patch

Try this.

* [Bug target/113874] GNU2 TLS descriptor calls do not follow psABI on x86_64-linux-gnu
  2024-02-11 13:59 [Bug target/113874] New: GNU2 TLS descriptor calls do not follow psABI on x86_64-linux-gnu fw at gcc dot gnu.org
                   ` (2 preceding siblings ...)
  2024-02-11 15:05 ` hjl.tools at gmail dot com
@ 2024-02-11 17:57 ` hjl.tools at gmail dot com
  2024-02-11 18:00 ` jakub at gcc dot gnu.org
                   ` (35 subsequent siblings)
  39 siblings, 0 replies; 41+ messages in thread
From: hjl.tools at gmail dot com @ 2024-02-11 17:57 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113874

H.J. Lu <hjl.tools at gmail dot com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|UNCONFIRMED                 |NEW
     Ever confirmed|0                           |1
   Last reconfirmed|                            |2024-02-11

--- Comment #4 from H.J. Lu <hjl.tools at gmail dot com> ---
(In reply to H.J. Lu from comment #3)
> Created attachment 57385 [details]
> A patch
> 
> Try this.

This doesn't work properly.  To work around in ld.so, _dl_tlsdesc_dynamic needs
to save and restore ALL registers, which can be expensive.

* [Bug target/113874] GNU2 TLS descriptor calls do not follow psABI on x86_64-linux-gnu
  2024-02-11 13:59 [Bug target/113874] New: GNU2 TLS descriptor calls do not follow psABI on x86_64-linux-gnu fw at gcc dot gnu.org
                   ` (3 preceding siblings ...)
  2024-02-11 17:57 ` hjl.tools at gmail dot com
@ 2024-02-11 18:00 ` jakub at gcc dot gnu.org
  2024-02-11 18:37 ` fw at gcc dot gnu.org
                   ` (34 subsequent siblings)
  39 siblings, 0 replies; 41+ messages in thread
From: jakub at gcc dot gnu.org @ 2024-02-11 18:00 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113874

--- Comment #5 from Jakub Jelinek <jakub at gcc dot gnu.org> ---
(In reply to H.J. Lu from comment #4)
> (In reply to H.J. Lu from comment #3)
> > Created attachment 57385 [details]
> > A patch
> > 
> > Try this.
> 
> This doesn't work properly.  To work around in ld.so, _dl_tlsdesc_dynamic
> needs to save and restore ALL registers, which can be expensive.

Or it could be compiled with options to make sure it doesn't use vector
registers etc., and only save/restore if it needs to call into some code where
libc can't afford that (say allocate memory).

* [Bug target/113874] GNU2 TLS descriptor calls do not follow psABI on x86_64-linux-gnu
  2024-02-11 13:59 [Bug target/113874] New: GNU2 TLS descriptor calls do not follow psABI on x86_64-linux-gnu fw at gcc dot gnu.org
                   ` (4 preceding siblings ...)
  2024-02-11 18:00 ` jakub at gcc dot gnu.org
@ 2024-02-11 18:37 ` fw at gcc dot gnu.org
  2024-02-11 19:47 ` hjl.tools at gmail dot com
                   ` (33 subsequent siblings)
  39 siblings, 0 replies; 41+ messages in thread
From: fw at gcc dot gnu.org @ 2024-02-11 18:37 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113874

--- Comment #6 from Florian Weimer <fw at gcc dot gnu.org> ---
> (In reply to H.J. Lu from comment #4)
> > (In reply to H.J. Lu from comment #3)
> > > Created attachment 57385 [details]
> > > A patch
> > > 
> > > Try this.
> > 
> > This doesn't work properly.  To work around in ld.so, _dl_tlsdesc_dynamic
> > needs to save and restore ALL registers, which can be expensive.

Why doesn't this work properly? Is it possible to make it work with a different
approach?

The __tls_get_addr call with the default approach potentially needs to solve
the same problem, doesn't it?

(In reply to Jakub Jelinek from comment #5)
> Or it could be compiled with options to make sure it doesn't use vector
> registers etc., and only save/restore if it needs to call into some code
> where libc can't afford that (say allocate memory).

We currently call into malloc, which could be a replacement malloc. If GCC
cannot be fixed, full context switch or elimination of the slow path are our
best options for a glibc-side fix.

* [Bug target/113874] GNU2 TLS descriptor calls do not follow psABI on x86_64-linux-gnu
  2024-02-11 13:59 [Bug target/113874] New: GNU2 TLS descriptor calls do not follow psABI on x86_64-linux-gnu fw at gcc dot gnu.org
                   ` (5 preceding siblings ...)
  2024-02-11 18:37 ` fw at gcc dot gnu.org
@ 2024-02-11 19:47 ` hjl.tools at gmail dot com
  2024-02-11 21:05 ` jakub at gcc dot gnu.org
                   ` (32 subsequent siblings)
  39 siblings, 0 replies; 41+ messages in thread
From: hjl.tools at gmail dot com @ 2024-02-11 19:47 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113874

--- Comment #7 from H.J. Lu <hjl.tools at gmail dot com> ---
(In reply to Florian Weimer from comment #6)
> > (In reply to H.J. Lu from comment #4)
> > > (In reply to H.J. Lu from comment #3)
> > > > Created attachment 57385 [details]
> > > > A patch
> > > > 
> > > > Try this.
> > > 
> > > This doesn't work properly.  To work around in ld.so, _dl_tlsdesc_dynamic
> > > needs to save and restore ALL registers, which can be expensive.
> 
> Why doesn't this work properly? Is it possible to make it work with a
> different approach?

The clobber must be attached to the TLS descriptor call insn.

> The __tls_get_addr call with the default approach potentially needs to solve
> the same problem, doesn't it?

Isn't __tls_get_addr called via the PLT entry?

> (In reply to Jakub Jelinek from comment #5)
> > Or it could be compiled with options to make sure it doesn't use vector
> > registers etc., and only save/restore if it needs to call into some code
> > where libc can't afford that (say allocate memory).
> 
> We currently call into malloc, which could be a replacement malloc. If GCC
> cannot be fixed, full context switch or elimination of the slow path are our
> best options for a glibc-side fix.

We should open a glibc bug.  I am working on the glibc fix.

* [Bug target/113874] GNU2 TLS descriptor calls do not follow psABI on x86_64-linux-gnu
  2024-02-11 13:59 [Bug target/113874] New: GNU2 TLS descriptor calls do not follow psABI on x86_64-linux-gnu fw at gcc dot gnu.org
                   ` (6 preceding siblings ...)
  2024-02-11 19:47 ` hjl.tools at gmail dot com
@ 2024-02-11 21:05 ` jakub at gcc dot gnu.org
  2024-02-12  7:01 ` fw at gcc dot gnu.org
                   ` (31 subsequent siblings)
  39 siblings, 0 replies; 41+ messages in thread
From: jakub at gcc dot gnu.org @ 2024-02-11 21:05 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113874

--- Comment #8 from Jakub Jelinek <jakub at gcc dot gnu.org> ---
E.g. https://sourceware.org/legacy-ml/binutils/2005-09/msg00184.html
says:

The functions defined above use custom calling conventions that
require them to preserve any registers they modify.  This penalizes
the case that requires dynamic TLS, since it must preserve all
call-clobbered registers before calling __tls_get_addr(), but it is
optimized for the most common case of static TLS, and also for the
case in which the code generated by the compiler can be relaxed by the
linker to a more efficient access model: being able to assume no
registers are clobbered by the call tends to improve register
allocation.  Also, the function that handles the dynamic TLS case will
most often be able to avoid calling __tls_get_addr(), thus potentially
avoiding the need for preserving registers.
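
The trade-off described in this quotation can be sketched in C. The following
is a simplified model, not glibc's implementation: struct tls_desc,
desc_static, desc_dynamic, slow_path_setup, thread_area and the other names are
hypothetical, and plain C cannot express the custom register-preserving calling
convention itself. It only illustrates the control flow. A descriptor holds an
entry function and an argument, the static entry returns a precomputed offset,
and the dynamic entry takes the expensive path (where call-clobbered registers
would have to be saved) only when the block has not been set up yet.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Model only: the "thread pointer" is the start of this array.  */
char thread_area[128];

/* Simplified descriptor: an entry function plus one argument.  */
struct tls_desc {
  ptrdiff_t (*entry) (struct tls_desc *);
  void *arg;
};

/* Argument used by the dynamic entry.  */
struct dyn_arg {
  ptrdiff_t block_offset;   /* -1 until the block is set up */
  ptrdiff_t var_offset;     /* offset of the variable inside the block */
};

int slow_path_calls;

/* Static TLS: the argument already is the offset.  */
ptrdiff_t
desc_static (struct tls_desc *d)
{
  return (ptrdiff_t) (intptr_t) d->arg;
}

/* Stand-in for the expensive path (__tls_get_addr, possibly malloc).  A
   real implementation would have to preserve call-clobbered registers
   around this call.  */
ptrdiff_t
slow_path_setup (void)
{
  ++slow_path_calls;
  return 64;   /* arbitrary position inside thread_area for this model */
}

/* Dynamic TLS: fast when the block exists, slow path otherwise.  */
ptrdiff_t
desc_dynamic (struct tls_desc *d)
{
  struct dyn_arg *a = d->arg;
  if (a->block_offset < 0)
    a->block_offset = slow_path_setup ();
  return a->block_offset + a->var_offset;
}

/* Models "call the descriptor, then add the thread pointer".  */
char *
tls_address (struct tls_desc *d)
{
  return thread_area + d->entry (d);
}

/* Hypothetical self-check for the model.  */
void
check_desc_model (void)
{
  struct tls_desc s = { desc_static, (void *) (intptr_t) 16 };
  struct dyn_arg a = { -1, 8 };
  struct tls_desc d = { desc_dynamic, &a };

  slow_path_calls = 0;
  assert (tls_address (&s) == thread_area + 16);
  assert (tls_address (&d) == thread_area + 72);
  assert (tls_address (&d) == thread_area + 72);
  assert (slow_path_calls == 1);
}
```

In this model the second dynamic lookup does not reach slow_path_setup, which
corresponds to the case where the dynamic handler can avoid calling
__tls_get_addr ().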

* [Bug target/113874] GNU2 TLS descriptor calls do not follow psABI on x86_64-linux-gnu
  2024-02-11 13:59 [Bug target/113874] New: GNU2 TLS descriptor calls do not follow psABI on x86_64-linux-gnu fw at gcc dot gnu.org
                   ` (7 preceding siblings ...)
  2024-02-11 21:05 ` jakub at gcc dot gnu.org
@ 2024-02-12  7:01 ` fw at gcc dot gnu.org
  2024-02-12  8:53 ` rguenth at gcc dot gnu.org
                   ` (30 subsequent siblings)
  39 siblings, 0 replies; 41+ messages in thread
From: fw at gcc dot gnu.org @ 2024-02-12  7:01 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113874

--- Comment #9 from Florian Weimer <fw at gcc dot gnu.org> ---
(In reply to H.J. Lu from comment #7)
> > The __tls_get_addr call with the default approach potentially needs to solve
> > the same problem, doesn't it?
> 
> Isn't __tls_get_addr called via the PLT entry?

I'm not sure if that matters? Even if the lazy binding trampoline is active, it
won't protect the actual call.

* [Bug target/113874] GNU2 TLS descriptor calls do not follow psABI on x86_64-linux-gnu
  2024-02-11 13:59 [Bug target/113874] New: GNU2 TLS descriptor calls do not follow psABI on x86_64-linux-gnu fw at gcc dot gnu.org
                   ` (8 preceding siblings ...)
  2024-02-12  7:01 ` fw at gcc dot gnu.org
@ 2024-02-12  8:53 ` rguenth at gcc dot gnu.org
  2024-02-12 10:46 ` fw at gcc dot gnu.org
                   ` (29 subsequent siblings)
  39 siblings, 0 replies; 41+ messages in thread
From: rguenth at gcc dot gnu.org @ 2024-02-12  8:53 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113874

Richard Biener <rguenth at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |rguenth at gcc dot gnu.org

--- Comment #10 from Richard Biener <rguenth at gcc dot gnu.org> ---
I think a glibc fix would be very much preferred.  Is -mtls-dialect=gnu2
supposed to work on a per-TU basis, or are all parts of an executable + loaded
shlibs required to have the same setting?

* [Bug target/113874] GNU2 TLS descriptor calls do not follow psABI on x86_64-linux-gnu
  2024-02-11 13:59 [Bug target/113874] New: GNU2 TLS descriptor calls do not follow psABI on x86_64-linux-gnu fw at gcc dot gnu.org
                   ` (9 preceding siblings ...)
  2024-02-12  8:53 ` rguenth at gcc dot gnu.org
@ 2024-02-12 10:46 ` fw at gcc dot gnu.org
  2024-02-12 10:53 ` jakub at gcc dot gnu.org
                   ` (28 subsequent siblings)
  39 siblings, 0 replies; 41+ messages in thread
From: fw at gcc dot gnu.org @ 2024-02-12 10:46 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113874

--- Comment #11 from Florian Weimer <fw at gcc dot gnu.org> ---
(In reply to Richard Biener from comment #10)
> I think a glibc fix would be very much preferred.

It's a bit of a maintenance nightmare because we have to update the code
slightly each time new registers are added, and there isn't a good way for
applications to detect whether they run on a compatible glibc.

> Is -mtls-dialect=gnu2
> supposed to work on a per-TU basis, or are all parts of an executable + loaded
> shlibs required to have the same setting?

It's possible to link various TLS variants together, and they should
interoperate.

* [Bug target/113874] GNU2 TLS descriptor calls do not follow psABI on x86_64-linux-gnu
  2024-02-11 13:59 [Bug target/113874] New: GNU2 TLS descriptor calls do not follow psABI on x86_64-linux-gnu fw at gcc dot gnu.org
                   ` (10 preceding siblings ...)
  2024-02-12 10:46 ` fw at gcc dot gnu.org
@ 2024-02-12 10:53 ` jakub at gcc dot gnu.org
  2024-02-12 10:53 ` jakub at gcc dot gnu.org
                   ` (27 subsequent siblings)
  39 siblings, 0 replies; 41+ messages in thread
From: jakub at gcc dot gnu.org @ 2024-02-12 10:53 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113874

--- Comment #12 from Jakub Jelinek <jakub at gcc dot gnu.org> ---
(In reply to Florian Weimer from comment #11)
> (In reply to Richard Biener from comment #10)
> > I think a glibc fix would be very much preferred.
> 
> It's a bit of a maintenance nightmare because we have to update the code
> slightly each time new registers are added, and there isn't a good way for
> applications to detect whether they run on a compatible glibc.

But it is what the ABI of GNU2 TLS says, and even what dl-tlsdesc.S says:
        /* Preserve call-clobbered registers that we modify.
Yeah, the fact that it can call a user-overloaded malloc significantly
complicates things.  Otherwise it would just be a matter of new registers that
can be modified while running whatever __tls_get_addr needs, and that set could
change only when glibc is rebuilt with some newer compiler which starts
modifying further call-clobbered registers.
But with an overloaded malloc, it can change merely because the overloaded
malloc is rebuilt with a newer compiler...

* [Bug target/113874] GNU2 TLS descriptor calls do not follow psABI on x86_64-linux-gnu
  2024-02-11 13:59 [Bug target/113874] New: GNU2 TLS descriptor calls do not follow psABI on x86_64-linux-gnu fw at gcc dot gnu.org
                   ` (11 preceding siblings ...)
  2024-02-12 10:53 ` jakub at gcc dot gnu.org
@ 2024-02-12 10:53 ` jakub at gcc dot gnu.org
  2024-02-12 10:56 ` rguenth at gcc dot gnu.org
                   ` (26 subsequent siblings)
  39 siblings, 0 replies; 41+ messages in thread
From: jakub at gcc dot gnu.org @ 2024-02-12 10:53 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113874

--- Comment #13 from Jakub Jelinek <jakub at gcc dot gnu.org> ---
BTW, isn't _mcount similar in this regard?

* [Bug target/113874] GNU2 TLS descriptor calls do not follow psABI on x86_64-linux-gnu
  2024-02-11 13:59 [Bug target/113874] New: GNU2 TLS descriptor calls do not follow psABI on x86_64-linux-gnu fw at gcc dot gnu.org
                   ` (12 preceding siblings ...)
  2024-02-12 10:53 ` jakub at gcc dot gnu.org
@ 2024-02-12 10:56 ` rguenth at gcc dot gnu.org
  2024-02-12 11:01 ` jakub at gcc dot gnu.org
                   ` (25 subsequent siblings)
  39 siblings, 0 replies; 41+ messages in thread
From: rguenth at gcc dot gnu.org @ 2024-02-12 10:56 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113874

Richard Biener <rguenth at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |matz at gcc dot gnu.org

--- Comment #14 from Richard Biener <rguenth at gcc dot gnu.org> ---
True.  Maybe the kernel VDSO should have a _save_all_regs (fnptr) and
"indirector" ...

* [Bug target/113874] GNU2 TLS descriptor calls do not follow psABI on x86_64-linux-gnu
  2024-02-11 13:59 [Bug target/113874] New: GNU2 TLS descriptor calls do not follow psABI on x86_64-linux-gnu fw at gcc dot gnu.org
                   ` (13 preceding siblings ...)
  2024-02-12 10:56 ` rguenth at gcc dot gnu.org
@ 2024-02-12 11:01 ` jakub at gcc dot gnu.org
  2024-02-12 11:30 ` rguenth at gcc dot gnu.org
                   ` (24 subsequent siblings)
  39 siblings, 0 replies; 41+ messages in thread
From: jakub at gcc dot gnu.org @ 2024-02-12 11:01 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113874

--- Comment #15 from Jakub Jelinek <jakub at gcc dot gnu.org> ---
Because right now it also means it needs to save/restore the APX registers
because malloc could be -mapxf compiled even when glibc isn't.

* [Bug target/113874] GNU2 TLS descriptor calls do not follow psABI on x86_64-linux-gnu
  2024-02-11 13:59 [Bug target/113874] New: GNU2 TLS descriptor calls do not follow psABI on x86_64-linux-gnu fw at gcc dot gnu.org
                   ` (14 preceding siblings ...)
  2024-02-12 11:01 ` jakub at gcc dot gnu.org
@ 2024-02-12 11:30 ` rguenth at gcc dot gnu.org
  2024-02-12 11:41 ` rguenth at gcc dot gnu.org
                   ` (23 subsequent siblings)
  39 siblings, 0 replies; 41+ messages in thread
From: rguenth at gcc dot gnu.org @ 2024-02-12 11:30 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113874

--- Comment #16 from Richard Biener <rguenth at gcc dot gnu.org> ---
I do wonder why __tls_get_addr would have to call the overloaded malloc, can
we just not force-bind it to the glibc local malloc (and make sure that's
compiled with -mgeneral-regs-only)?

* [Bug target/113874] GNU2 TLS descriptor calls do not follow psABI on x86_64-linux-gnu
  2024-02-11 13:59 [Bug target/113874] New: GNU2 TLS descriptor calls do not follow psABI on x86_64-linux-gnu fw at gcc dot gnu.org
                   ` (15 preceding siblings ...)
  2024-02-12 11:30 ` rguenth at gcc dot gnu.org
@ 2024-02-12 11:41 ` rguenth at gcc dot gnu.org
  2024-02-12 12:17 ` fw at gcc dot gnu.org
                   ` (22 subsequent siblings)
  39 siblings, 0 replies; 41+ messages in thread
From: rguenth at gcc dot gnu.org @ 2024-02-12 11:41 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113874

--- Comment #17 from Richard Biener <rguenth at gcc dot gnu.org> ---
(In reply to Richard Biener from comment #16)
> I do wonder why __tls_get_addr would have to call the overloaded malloc, can
> we just not force-bind it to the glibc local malloc (and make sure that's
> compiled with -mgeneral-regs-only)?

I realize we end up calling memset (but __mempcpy?) as well, that might
end up in an ifunc and thus using non-general regs as well (and be
overloaded of course).  So the whole __tls_get_addr path would need to
make sure it never goes out of glibc controlled sources.

* [Bug target/113874] GNU2 TLS descriptor calls do not follow psABI on x86_64-linux-gnu
  2024-02-11 13:59 [Bug target/113874] New: GNU2 TLS descriptor calls do not follow psABI on x86_64-linux-gnu fw at gcc dot gnu.org
                   ` (16 preceding siblings ...)
  2024-02-12 11:41 ` rguenth at gcc dot gnu.org
@ 2024-02-12 12:17 ` fw at gcc dot gnu.org
  2024-02-12 12:24 ` hjl.tools at gmail dot com
                   ` (21 subsequent siblings)
  39 siblings, 0 replies; 41+ messages in thread
From: fw at gcc dot gnu.org @ 2024-02-12 12:17 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113874

--- Comment #18 from Florian Weimer <fw at gcc dot gnu.org> ---
(In reply to Richard Biener from comment #16)
> I do wonder why __tls_get_addr would have to call the overloaded malloc, can
> we just not force-bind it to the glibc local malloc (and make sure that's
> compiled with -mgeneral-regs-only)?

Using the glibc malloc just for some small TLS allocation is rather wasteful
because of its (mostly) per-thread data structures. Allocating from the main
arena potentially clashes with brk usage from the replacement malloc.

We'd need an alternative memory allocator (in addition to replacement string
functions), but that is known to break Thread Sanitizer and Leak Sanitizer.

* [Bug target/113874] GNU2 TLS descriptor calls do not follow psABI on x86_64-linux-gnu
  2024-02-11 13:59 [Bug target/113874] New: GNU2 TLS descriptor calls do not follow psABI on x86_64-linux-gnu fw at gcc dot gnu.org
                   ` (17 preceding siblings ...)
  2024-02-12 12:17 ` fw at gcc dot gnu.org
@ 2024-02-12 12:24 ` hjl.tools at gmail dot com
  2024-02-12 12:32 ` fw at gcc dot gnu.org
                   ` (20 subsequent siblings)
  39 siblings, 0 replies; 41+ messages in thread
From: hjl.tools at gmail dot com @ 2024-02-12 12:24 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113874

--- Comment #19 from H.J. Lu <hjl.tools at gmail dot com> ---
(In reply to Florian Weimer from comment #9)
> (In reply to H.J. Lu from comment #7)
> > > The __tls_get_addr call with the default approach potentially needs to solve
> > > the same problem, doesn't it?
> > 
> > Isn't __tls_get_addr called via the PLT entry?
> 
> I'm not sure if that matters? Even if the lazy binding trampoline is active,
> it won't protect the actual call.

Non-GNU2 TLS has

0000000000004000  0000000100000007 R_X86_64_JUMP_SLOT     0000000000000000
__tls_get_addr + 1010

which calls _dl_runtime_resolve with lazy binding. _dl_runtime_resolve
preserves all caller-saved registers.

* [Bug target/113874] GNU2 TLS descriptor calls do not follow psABI on x86_64-linux-gnu
  2024-02-11 13:59 [Bug target/113874] New: GNU2 TLS descriptor calls do not follow psABI on x86_64-linux-gnu fw at gcc dot gnu.org
                   ` (18 preceding siblings ...)
  2024-02-12 12:24 ` hjl.tools at gmail dot com
@ 2024-02-12 12:32 ` fw at gcc dot gnu.org
  2024-02-12 12:37 ` hjl.tools at gmail dot com
                   ` (19 subsequent siblings)
  39 siblings, 0 replies; 41+ messages in thread
From: fw at gcc dot gnu.org @ 2024-02-12 12:32 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113874

--- Comment #20 from Florian Weimer <fw at gcc dot gnu.org> ---
(In reply to H.J. Lu from comment #19)
> (In reply to Florian Weimer from comment #9)
> > (In reply to H.J. Lu from comment #7)
> > > > The __tls_get_addr call with the default approach potentially needs to solve
> > > > the same problem, doesn't it?
> > > 
> > > Isn't __tls_get_addr called via the PLT entry?
> > 
> > I'm not sure if that matters? Even if the lazy binding trampoline is active,
> > it won't protect the actual call.
> 
> Non-GNU2 TLS has
> 
> 0000000000004000  0000000100000007 R_X86_64_JUMP_SLOT     0000000000000000
> __tls_get_addr + 1010
> 
> which calls _dl_runtime_resolve with lazy binding. _dl_runtime_resolve
> preserves all caller-saved registers.

The dynamic linker preserves register contents during lazy binding and restores
them before calling __tls_get_addr, so it doesn't help with __tls_get_addr
register usage itself. And lazy binding happens only once per process and
object, while we need to protect the first call on every thread.
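
The distinction drawn here can be modelled with a function pointer standing in
for the PLT/GOT slot. This is a hypothetical sketch: got_slot, resolver_stub,
target, call_via_plt and the counters are made-up names, and no real
dynamic-linker interface is used. The resolver runs once and patches the slot,
and every later call reaches the target directly, so whatever the resolver
saves and restores has no bearing on what the target itself does on later
calls.

```c
#include <assert.h>

typedef int (*slot_fn) (int);

int resolver_runs;   /* how often the lazy-binding stub ran */
int target_runs;     /* how often the real function ran */

int target (int x);
int resolver_stub (int x);

/* Models the GOT entry: initially it points at the resolver stub.  */
slot_fn got_slot = resolver_stub;

/* Stand-in for the real function (__tls_get_addr in the discussion).  */
int
target (int x)
{
  ++target_runs;
  return x + 1;
}

/* Stand-in for lazy binding: patch the slot, then forward the call.  A
   real resolver would save and restore registers around its own work,
   but not around the target's.  */
int
resolver_stub (int x)
{
  ++resolver_runs;
  got_slot = target;
  return target (x);
}

/* Models a call through the PLT.  */
int
call_via_plt (int x)
{
  return got_slot (x);
}

/* Hypothetical self-check: three calls, one resolver run.  */
void
check_lazy_binding (void)
{
  got_slot = resolver_stub;
  resolver_runs = 0;
  target_runs = 0;
  for (int i = 0; i < 3; i++)
    assert (call_via_plt (i) == i + 1);
  assert (resolver_runs == 1);
  assert (target_runs == 3);
}
```

The model is per process, like lazy binding itself; it has no notion of a
per-thread first call, which is the case that still needs protection.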

* [Bug target/113874] GNU2 TLS descriptor calls do not follow psABI on x86_64-linux-gnu
  2024-02-11 13:59 [Bug target/113874] New: GNU2 TLS descriptor calls do not follow psABI on x86_64-linux-gnu fw at gcc dot gnu.org
                   ` (19 preceding siblings ...)
  2024-02-12 12:32 ` fw at gcc dot gnu.org
@ 2024-02-12 12:37 ` hjl.tools at gmail dot com
  2024-02-12 14:27 ` jakub at gcc dot gnu.org
                   ` (18 subsequent siblings)
  39 siblings, 0 replies; 41+ messages in thread
From: hjl.tools at gmail dot com @ 2024-02-12 12:37 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113874

--- Comment #21 from H.J. Lu <hjl.tools at gmail dot com> ---
(In reply to Florian Weimer from comment #20)
> (In reply to H.J. Lu from comment #19)
> > (In reply to Florian Weimer from comment #9)
> > > (In reply to H.J. Lu from comment #7)
> > > > > The __tls_get_addr call with the default approach potentially needs to solve
> > > > > the same problem, doesn't it?
> > > > 
> > > > Isn't __tls_get_addr called via the PLT entry?
> > > 
> > > I'm not sure if that matters? Even if the lazy binding trampoline is active,
> > > it won't protect the actual call.
> > 
> > Non-GNU2 TLS has
> > 
> > 0000000000004000  0000000100000007 R_X86_64_JUMP_SLOT     0000000000000000
> > __tls_get_addr + 1010
> > 
> > which calls _dl_runtime_resolve with lazy binding. _dl_runtime_resolve
> > preserves all caller-saved registers.
> 
> The dynamic linker preserves register contents during lazy binding and
> restores them before calling __tls_get_addr, so it doesn't help with
> __tls_get_addr register usage itself. And lazy binding happens only once per
> process and object, while we need to protect the first call on every thread.

Only the call from _dl_tlsdesc_dynamic isn't protected.  My glibc patch:

https://patchwork.sourceware.org/project/glibc/list/?series=30800

fixes it.

* [Bug target/113874] GNU2 TLS descriptor calls do not follow psABI on x86_64-linux-gnu
  2024-02-11 13:59 [Bug target/113874] New: GNU2 TLS descriptor calls do not follow psABI on x86_64-linux-gnu fw at gcc dot gnu.org
                   ` (20 preceding siblings ...)
  2024-02-12 12:37 ` hjl.tools at gmail dot com
@ 2024-02-12 14:27 ` jakub at gcc dot gnu.org
  2024-02-12 14:41 ` hjl.tools at gmail dot com
                   ` (17 subsequent siblings)
  39 siblings, 0 replies; 41+ messages in thread
From: jakub at gcc dot gnu.org @ 2024-02-12 14:27 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113874

--- Comment #22 from Jakub Jelinek <jakub at gcc dot gnu.org> ---
BTW, does aarch64 dl-tlsdesc.S save SVE/SME register state (I only see fixed
offsets in there), or are those call-saved?
What about floating point registers in x86_64/dl-tlsdesc.S?
And i386/dl-tlsdesc.S needs to save/restore 387 and SSE regs?

* [Bug target/113874] GNU2 TLS descriptor calls do not follow psABI on x86_64-linux-gnu
  2024-02-11 13:59 [Bug target/113874] New: GNU2 TLS descriptor calls do not follow psABI on x86_64-linux-gnu fw at gcc dot gnu.org
                   ` (21 preceding siblings ...)
  2024-02-12 14:27 ` jakub at gcc dot gnu.org
@ 2024-02-12 14:41 ` hjl.tools at gmail dot com
  2024-02-12 14:42 ` hjl.tools at gmail dot com
                   ` (16 subsequent siblings)
  39 siblings, 0 replies; 41+ messages in thread
From: hjl.tools at gmail dot com @ 2024-02-12 14:41 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113874

--- Comment #23 from H.J. Lu <hjl.tools at gmail dot com> ---
(In reply to Jakub Jelinek from comment #22)
> BTW, does aarch64 dl-tlsdesc.S save SVE/SME register state (I only see fixed
> offsets in there), or are those call-saved?
> What about floating point registers in x86_64/dl-tlsdesc.S?

Floating point registers are preserved with my glibc patch.

> And i386/dl-tlsdesc.S needs to save/restore 387 and SSE regs?

i386 doesn't preserve them in _dl_runtime_resolve nor _dl_tlsdesc_dynamic.

* [Bug target/113874] GNU2 TLS descriptor calls do not follow psABI on x86_64-linux-gnu
From: hjl.tools at gmail dot com @ 2024-02-12 14:42 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113874

H.J. Lu <hjl.tools at gmail dot com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
         Resolution|---                         |MOVED
             Status|NEW                         |RESOLVED

--- Comment #24 from H.J. Lu <hjl.tools at gmail dot com> ---
Moved to glibc.

* [Bug target/113874] GNU2 TLS descriptor calls do not follow psABI on x86_64-linux-gnu
From: jakub at gcc dot gnu.org @ 2024-02-12 14:45 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113874

--- Comment #25 from Jakub Jelinek <jakub at gcc dot gnu.org> ---
(In reply to H.J. Lu from comment #23)
> > And i386/dl-tlsdesc.S needs to save/restore 387 and SSE regs?
> 
> i386 doesn't preserve them in _dl_runtime_resolve nor _dl_tlsdesc_dynamic.

That is different.  _dl_runtime_resolve runs only at the start of a call to a
function; if in all supported ia32 ABIs all of the i387 state is unsupported
upon entering functions, then there is no need to save anything.
_dl_tlsdesc_dynamic, on the other hand, can run anywhere within a function and
is not expected to clobber any register except ax, which receives the value,
so I think the state needs to be saved in that case.
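
To illustrate the distinction, here is a minimal sketch, assuming GCC or Clang
with __thread support; tls_counter, scale_and_count and demo_scale_and_count
are hypothetical names, not part of glibc or of the test case in this report.
The TLS access sits in the middle of the function while a floating-point value
is still live, which is where a descriptor call could land if the code were
built into a shared object with -fpic -mtls-dialect=gnu2. A lazy PLT
resolution, by contrast, would only happen at a call boundary.

```c
#include <assert.h>

/* Hypothetical thread-local counter.  In a shared object built with
   -fpic -mtls-dialect=gnu2, an access to it may become a TLS descriptor
   call (an assumption about the build, not something this file forces).  */
static __thread long tls_counter;

/* 'scaled' is computed before the TLS access and used after it, so the
   compiler is free to keep it in an x87 or SSE register across the access.  */
double
scale_and_count (double x)
{
  double scaled = x * 1.5;   /* live across the TLS access below */
  tls_counter += 1;          /* possible descriptor call site */
  return scaled + (double) tls_counter;
}

/* Usage: two calls on the same thread see the counter advance while the
   floating-point part of the result survives the TLS access.  */
void
demo_scale_and_count (void)
{
  assert (scale_and_count (2.0) == 4.0);   /* 3.0 + 1 */
  assert (scale_and_count (2.0) == 5.0);   /* 3.0 + 2 */
}
```

If the descriptor call clobbered the register holding 'scaled', the checks in
demo_scale_and_count could fail; when the file is built as a plain executable
no descriptor call is made at all, so this only shows the shape of the problem.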

* [Bug target/113874] GNU2 TLS descriptor calls do not follow psABI on x86_64-linux-gnu
From: hjl.tools at gmail dot com @ 2024-02-12 16:53 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113874

--- Comment #26 from H.J. Lu <hjl.tools at gmail dot com> ---
(In reply to Jakub Jelinek from comment #25)
> (In reply to H.J. Lu from comment #23)
> > > And i386/dl-tlsdesc.S needs to save/restore 387 and SSE regs?
> > 
> > i386 doesn't preserve them in _dl_runtime_resolve nor _dl_tlsdesc_dynamic.
> 
> That is different.  _dl_runtime_resolve happens only at the start of calls
> to functions, if in all supported ia32 ABIs all of i387 state is unsupported
> upon entering functions, then there is no need to save anything.
> While _dl_tlsdesc_dynamic can happen anywhere from within functions and
> doesn't clobber any registers except ax which gets the value, so I think it
> needs to be saved for that case.

I couldn't find a test to show it is needed on i386:

#0  __GI___libc_malloc (bytes=3200) at malloc.c:3294
#1  0xf7fdb771 in malloc (size=<optimized out>) at ../include/rtld-malloc.h:56
#2  allocate_dtv_entry (size=<optimized out>, alignment=4) at dl-tls.c:679
#3  allocate_and_init (map=0xf6e00670) at dl-tls.c:704
#4  tls_get_addr_tail (ti=0xf6e00a30, dtv=0x5655fcd8, the_map=0xf6e00670)
    at dl-tls.c:904
#5  0xf7fdf5d5 in _dl_tlsdesc_dynamic () at ../sysdeps/i386/dl-tlsdesc.S:129
#6  0xf7fb017b in apply_tls (p=0xf7a0037c) at tst-gnu2-tls2mod1.c:26
#7  0x5655769b in access_mod (i=1, sym=0x5655a026 "apply_tls")
    at ../sysdeps/i386/i686/tst-gnu2-tls2-i686.c:55
#8  start (arg=0x0) at ../sysdeps/i386/i686/tst-gnu2-tls2-i686.c:70
#9  0xf7c96207 in start_thread (arg=<optimized out>) at pthread_create.c:447
#10 0xf7d3dc08 in clone3 () at ../sysdeps/unix/sysv/linux/i386/clone3.S:111

Even if I compile ia32 glibc with -march=skylake, the _dl_tlsdesc_dynamic slow
path doesn't touch XMM registers at all.

* [Bug target/113874] GNU2 TLS descriptor calls do not follow psABI on x86_64-linux-gnu
From: jakub at gcc dot gnu.org @ 2024-02-12 16:59 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113874

--- Comment #27 from Jakub Jelinek <jakub at gcc dot gnu.org> ---
(In reply to H.J. Lu from comment #26)
> Even if I compile ia32 glibc with -march=skylake, the _dl_tlsdesc_dynamic
> slow path doesn't touch XMM registers at all.

I thought Florian said it can call malloc, and malloc can be user-provided and
can use SSE2, 387/MMX or whatever other call-clobbered registers ia32 has.

* [Bug target/113874] GNU2 TLS descriptor calls do not follow psABI on x86_64-linux-gnu
From: hjl.tools at gmail dot com @ 2024-02-12 17:02 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113874

--- Comment #28 from H.J. Lu <hjl.tools at gmail dot com> ---
(In reply to Jakub Jelinek from comment #27)
> (In reply to H.J. Lu from comment #26)
> > Even if I compile ia32 glibc with -march=skylake, the _dl_tlsdesc_dynamic
> > slow path doesn't touch XMM registers at all.
> 
> I thought Florian said it can call malloc and malloc can be user provided
> and can use SSE2, 387/MMX or whatever other call clobbered registers ia32
> has.

[hjl@gnu-cfl-3 elf]$ readelf -rW ld.so

Relocation section '.rel.dyn' at offset 0x9f8 contains 3 entries:
 Offset     Info    Type                Sym. Value  Symbol's Name
00032fe0  00001a06 R_386_GLOB_DAT         00031ac0   __rseq_offset@@GLIBC_2.35
00032fe4  00001f06 R_386_GLOB_DAT         00031ac4   __rseq_size@@GLIBC_2.35
00032b20  0000002a R_386_IRELATIVE       

Relocation section '.relr.dyn' at offset 0xa10 contains 3 entries:
  12 offsets
00031a60
00032ed0
00032ed8
00032f04
00032f08
00032f0c
00032f10
00032f14
00032f18
00032f1c
00032f20
00032f24
[hjl@gnu-cfl-3 elf]$ 

You can't use another malloc for the ld.so internal usage of malloc/calloc.

* [Bug target/113874] GNU2 TLS descriptor calls do not follow psABI on x86_64-linux-gnu
From: matz at gcc dot gnu.org @ 2024-02-12 17:03 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113874

--- Comment #29 from Michael Matz <matz at gcc dot gnu.org> ---
It not only can call malloc.  As the backtrace of H.J. shows, it quite clearly
_does_ so :-)

That's why there is talk earlier in this report about potentially not using
malloc as a one-time allocator for thread-local areas at all, or allocating the
memory at a different time than from __tls_get_addr.

* [Bug target/113874] GNU2 TLS descriptor calls do not follow psABI on x86_64-linux-gnu
From: hjl.tools at gmail dot com @ 2024-02-12 17:08 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113874

--- Comment #30 from H.J. Lu <hjl.tools at gmail dot com> ---
(In reply to Michael Matz from comment #29)
> It not only can call malloc.  As the backtrace of H.J. shows, it quite
> clearly _does_ so :-)
> 

ld.so can only call the malloc implementation internal to ld.so.

* [Bug target/113874] GNU2 TLS descriptor calls do not follow psABI on x86_64-linux-gnu
From: matz at gcc dot gnu.org @ 2024-02-12 17:13 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113874

--- Comment #31 from Michael Matz <matz at gcc dot gnu.org> ---
(In reply to H.J. Lu from comment #30)
> (In reply to Michael Matz from comment #29)
> > It not only can call malloc.  As the backtrace of H.J. shows, it quite
> > clearly _does_ so :-)
> 
> ld.so can only call the malloc implementation internal to ld.so.

(And string functions for initializing that memory.)  If that's already ensured
everywhere: super.  I agree that this is the best thing to do here.  From my
perspective these are purely internal implementation details, and hence setting
up thread-local areas should not be expected to be interposable by users.
(A custom allocator that isn't malloc or doesn't interact with it would also
work.)

* [Bug target/113874] GNU2 TLS descriptor calls do not follow psABI on x86_64-linux-gnu
From: hjl.tools at gmail dot com @ 2024-02-12 17:15 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113874

--- Comment #32 from H.J. Lu <hjl.tools at gmail dot com> ---
(In reply to Michael Matz from comment #31)
> (In reply to H.J. Lu from comment #30)
> > (In reply to Michael Matz from comment #29)
> > > It not only can call malloc.  As the backtrace of H.J. shows, it quite
> > > clearly _does_ so :-)
> > 
> > ld.so can only call the malloc implementation internal to ld.so.
> 
> (And string functions for initializing that memory)  If that's ensured
> already everywhere: super.  Because I agree, that this is the best thing to
> do here.
> From my perspective this is pure internal implementation details and hence
> setting up thread-local areas should not be expected to be interposable by
> users.
> (a custom allocator that isn't malloc or doesn't interact with it also would
> work)

Since ia32 ld.so in glibc is compiled with:

Makefile:rtld-CFLAGS += -mno-sse -mno-mmx -mfpmath=387

ia32 _dl_tlsdesc_dynamic is OK.

* [Bug target/113874] GNU2 TLS descriptor calls do not follow psABI on x86_64-linux-gnu
From: hjl.tools at gmail dot com @ 2024-02-12 17:16 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113874

--- Comment #33 from H.J. Lu <hjl.tools at gmail dot com> ---
(In reply to H.J. Lu from comment #32)
> Since ia32 ld.so in glibc is compiled with:
> 
> Makefile:rtld-CFLAGS += -mno-sse -mno-mmx -mfpmath=387
> 
> ia32 _dl_tlsdesc_dynamic is OK.

387 registers may be an issue.

* [Bug target/113874] GNU2 TLS descriptor calls do not follow psABI on x86_64-linux-gnu
From: hjl.tools at gmail dot com @ 2024-02-12 17:17 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113874

--- Comment #34 from H.J. Lu <hjl.tools at gmail dot com> ---
(In reply to H.J. Lu from comment #33)
> 387 registers may be an issue.

I checked ld.so.  It doesn't use 387 registers.

* [Bug target/113874] GNU2 TLS descriptor calls do not follow psABI on x86_64-linux-gnu
From: schwab@linux-m68k.org @ 2024-02-12 17:21 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113874

--- Comment #35 from Andreas Schwab <schwab@linux-m68k.org> ---
ld.so uses its internal malloc only during bootstrapping.

* [Bug target/113874] GNU2 TLS descriptor calls do not follow psABI on x86_64-linux-gnu
From: hjl.tools at gmail dot com @ 2024-02-12 17:30 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113874

--- Comment #36 from H.J. Lu <hjl.tools at gmail dot com> ---
(In reply to Andreas Schwab from comment #35)
> ld.so uses its internal malloc only during bootstrapping.

___tls_get_addr always uses the internal malloc.

* [Bug target/113874] GNU2 TLS descriptor calls do not follow psABI on x86_64-linux-gnu
From: schwab@linux-m68k.org @ 2024-02-12 17:38 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113874

--- Comment #37 from Andreas Schwab <schwab@linux-m68k.org> ---
No, it uses whatever __rtld_malloc points at, which will be the normal malloc
after bootstrap.
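
The mechanism described here, calls routed through a pointer that is
redirected once bootstrap is over, can be sketched as follows. This is only an
illustration under stated assumptions: bootstrap_malloc, loader_malloc,
finish_bootstrap and from_bootstrap_arena are hypothetical names standing in
for the loader's internals, not glibc's actual code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical bump allocator standing in for the loader's minimal malloc.  */
static _Alignas (16) char bootstrap_arena[4096];
static size_t bootstrap_used;

void *
bootstrap_malloc (size_t n)
{
  /* Reject oversized requests first so the rounding below cannot wrap.  */
  if (n > sizeof bootstrap_arena - bootstrap_used)
    return NULL;
  size_t rounded = (n + 15) & ~(size_t) 15;
  if (rounded > sizeof bootstrap_arena - bootstrap_used)
    return NULL;
  void *p = bootstrap_arena + bootstrap_used;
  bootstrap_used += rounded;   /* never freed, never reused */
  return p;
}

/* Hypothetical stand-in for the pointer the loader calls through.  */
void *(*loader_malloc) (size_t) = bootstrap_malloc;

/* After bootstrap, redirect the pointer to the process-wide malloc, which
   may be an interposed, user-provided implementation.  Meant to be called
   exactly once.  */
void
finish_bootstrap (void)
{
  assert (loader_malloc == bootstrap_malloc);
  loader_malloc = malloc;
}

/* Return 1 if P lies inside the bootstrap arena, 0 otherwise.  */
int
from_bootstrap_arena (const void *p)
{
  uintptr_t a = (uintptr_t) p;
  uintptr_t lo = (uintptr_t) bootstrap_arena;
  return a >= lo && a < lo + sizeof bootstrap_arena;
}
```

In this sketch, an allocation made through loader_malloc before
finish_bootstrap() comes from the private arena, and one made afterwards
comes from whatever malloc the process resolved. That is why code the
allocation path runs after bootstrap is not limited to the registers ld.so
itself uses.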

* [Bug target/113874] GNU2 TLS descriptor calls do not follow psABI on x86_64-linux-gnu
From: hjl.tools at gmail dot com @ 2024-02-13  4:19 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113874

--- Comment #38 from H.J. Lu <hjl.tools at gmail dot com> ---
The new glibc patch set covers both i386 and x86-64:

https://patchwork.sourceware.org/project/glibc/list/?series=30854

* [Bug target/113874] GNU2 TLS descriptor calls do not follow psABI on x86_64-linux-gnu
From: rguenth at gcc dot gnu.org @ 2024-02-13  8:34 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113874

--- Comment #39 from Richard Biener <rguenth at gcc dot gnu.org> ---
(In reply to H.J. Lu from comment #32)
> Since ia32 ld.so in glibc is compiled with:
> 
> Makefile:rtld-CFLAGS += -mno-sse -mno-mmx -mfpmath=387
> 
> ia32 _dl_tlsdesc_dynamic is OK.

Maybe also use -minline-all-stringops to avoid using IFUNC accelerated
memset/memcpy?

* [Bug target/113874] GNU2 TLS descriptor calls do not follow psABI on x86_64-linux-gnu
From: nsz at gcc dot gnu.org @ 2024-02-13  9:00 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113874

nsz at gcc dot gnu.org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |nsz at gcc dot gnu.org

--- Comment #40 from nsz at gcc dot gnu.org ---
(In reply to Jakub Jelinek from comment #22)
> BTW, does aarch64 dl-tlsdesc.S save SVE/SME register state (I only see fixed
> offsets in there), or are those call-saved?

call-saved.
