From mboxrd@z Thu Jan 1 00:00:00 1970
From: Richard Biener
Date: Thu, 22 Jul 2021 14:12:07 +0200
Subject: Re: Disabling TLS address caching to help QEMU on GNU/Linux
To: Florian Weimer
Cc: GCC Development, GNU C Library, qemu-devel@nongnu.org
References: <87im15qbp3.fsf@oldenburg.str.redhat.com>
In-Reply-To: <87im15qbp3.fsf@oldenburg.str.redhat.com>

On Tue, Jul 20, 2021 at 4:54 PM Florian Weimer via Gcc wrote:
>
> Currently, the GNU/Linux ABI does not really specify whether the thread
> pointer (the address of the TCB) may change at a function boundary.
>
> Traditionally, GCC assumes that the ABI allows caching addresses of
> thread-local variables across function calls. Such caching varies in
> aggressiveness between targets, probably due to differences in the
> choice of -mtls-dialect=gnu and -mtls-dialect=gnu2 as the default for
> the targets. (Caching with -mtls-dialect=gnu2 appears to be more
> aggressive.)
>
> In addition to that, glibc defines errno as this:
>
>   extern int *__errno_location (void) __attribute__ ((__const__));
>   #define errno (*__errno_location ())
>
> And the const attribute has the side effect of caching the address of
> errno within the same stack frame.
>
> With stackful coroutines, such address caching is only valid if
> coroutines are only ever resumed on the same thread on which they were
> suspended. (The C++ coroutine implementation is not stackful and is not
> affected by this at the ABI level.) Historically, I think we took the
> position that cross-thread resumption is undefined. But the ABIs aren't
> crystal-clear on this matter.
>
> One important piece of software for GNU is QEMU (not just for GNU/Linux,
> Hurd development also benefits from virtualization). QEMU uses stackful
> coroutines extensively. There are some hard-to-change code areas where
> resumption happens across threads unfortunately. These increasingly
> cause problems with more inlining, inter-procedural analysis, and a
> general push towards LTO (which is also needed for some security
> hardening features).
>
> Should the GNU toolchain offer something to help out the QEMU
> developers? Maybe GCC could offer an option to disable the caching for
> all TLS models. glibc could detect that mode based on a new
> preprocessor macro and adjust its __errno_location declaration and
> similar function declarations. There will be a performance impact of
> this, of course, but it would make the QEMU usage well-defined (at the
> lowest levels).

But how does TLS usage transfer between threads? On the gimple level
the TLS pointer is not visible and thus we'd happily CSE its address:

__thread int x[2];

void bar (int *);

int *foo(int i)
{
  int *p = &x[i];
  bar (p);
  return &x[i];
}

results in

int * foo (int i)
{
  int * p;
  sizetype _5;
  sizetype _6;

  <bb 2> [local count: 1073741824]:
  _5 = (sizetype) i_1(D);
  _6 = _5 * 4;
  p_2 = &x + _6;
  bar (p_2);
  return p_2;

}

to make this work as expected one would need to expose the TLS pointer
access.

> If this is a programming model that should be supported, then restoring
> some of the optimizations would be possible, by annotating
> context-switching functions and TLS-address-dependent functions. But I
> think QEMU would immediately benefit from just the simple approach that
> disables address caching of TLS variables.
>
> Thanks,
> Florian
>
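
For illustration only, here is a rough source-level sketch of what
"exposing the TLS pointer access" might look like for the foo example
above, without any new compiler option. Nothing in it is proposed in
this thread: x_addr, record_addr and tls_addr_differs_across_threads are
hypothetical names, bar's body is a stand-in, and the noinline attribute
and empty asm statement are GCC/Clang extensions. It is an assumption
that an opaque, non-inlined accessor is enough to stop the address from
being reused across the call on every target and with LTO.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Thread-local array, as in the foo example above. */
static __thread int x[2];

/*
 * Hypothetical accessor: computes the address of x[i] for whichever
 * thread is running at the time of the call.  noinline keeps the address
 * computation out of the caller, and the volatile asm keeps the function
 * from being treated as const/pure, so each call is a fresh computation.
 */
__attribute__((noinline)) int *x_addr(int i)
{
    int *p = &x[i];
    __asm__ volatile("" : "+r"(p)); /* make p opaque to the optimizer */
    return p;
}

/* Stand-in for a function that might suspend and resume a coroutine. */
void bar(int *p)
{
    *p += 1;
}

int *foo(int i)
{
    int *p = x_addr(i);
    bar(p);
    return x_addr(i); /* recomputed after the call instead of reusing p */
}

/* Helper for the check below: store this thread's address of x[0]. */
static void *record_addr(void *out)
{
    *(int **)out = x_addr(0);
    return NULL;
}

/* Returns 1 if a second thread sees a different address for x[0]. */
int tls_addr_differs_across_threads(void)
{
    int *mine = x_addr(0);
    int *other = NULL;
    pthread_t t;
    int rc = pthread_create(&t, NULL, record_addr, &other);

    assert(rc == 0);
    pthread_join(t, NULL);
    return other != NULL && other != mine;
}
```

The helper at the end only checks that the accessor yields a per-thread
address; it does not exercise a cross-thread coroutine resume. The cost
is an out-of-line call per access, which is the kind of performance
impact the quoted proposal already anticipates.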