References: <20240212031949.3041730-1-hjl.tools@gmail.com> <87eddic680.fsf@oldenburg3.str.redhat.com>
In-Reply-To: <87eddic680.fsf@oldenburg3.str.redhat.com>
From: "H.J. Lu"
Date: Mon, 12 Feb 2024 04:34:22 -0800
Subject: Re: [PATCH v2] x86-64: Update _dl_tlsdesc_dynamic to preserve vector registers
To: Florian Weimer
Cc: libc-alpha@sourceware.org

On Mon, Feb 12, 2024 at 2:44 AM Florian Weimer wrote:
>
> * H. J. Lu:
>
> > Compiler generates the following instruction sequence for GNU2 dynamic
> > TLS access:
> >
> >	leaq	tls_var@TLSDESC(%rip), %rax
> >	call	*tls_var@TLSCALL(%rax)
> >
> > CALL instruction may be transparent to compiler which assumes all
> > registers, except for RAX, are unchanged after CALL.  At run-time,
> > _dl_tlsdesc_dynamic is called, which calls __tls_get_addr on the
> > slow path.  __tls_get_addr is a normal function which doesn't
> > preserve any caller-saved registers.  _dl_tlsdesc_dynamic saves and
> > restores integer caller-saved registers, but doesn't preserve any
> > vector registers which are caller-saved.  Add _dl_tlsdesc_dynamic
> > IFUNC functions for FXSAVE, XSAVE and XSAVEC to save and restore
> > all vector registers.  This fixes BZ #31372.
>
> What about the flags register?

It's still clobbered.

(define_insn "*tls_dynamic_gnu2_call_64_"
  [(set (match_operand:PTR 0 "register_operand" "=a")
	(unspec:PTR [(match_operand 1 "tls_symbolic_operand")
		     (match_operand:PTR 2 "register_operand" "0")
		     (reg:PTR SP_REG)]
		    UNSPEC_TLSDESC))
   (clobber (reg:CC FLAGS_REG))]	<<<<<<< clobbered
  "TARGET_64BIT && TARGET_GNU2_TLS"
  "call\t{*%a1@TLSCALL(%2)|[QWORD PTR [%2+%a1@TLSCALL]]}"
  [(set_attr "type" "call")
   (set_attr "length" "2")
   (set_attr "length_address" "0")])

> What about stack pointer alignment?

Fixed by

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=58066

My glibc patch assumes that the stack is aligned to 8 bytes.  But
it only affects the FXSAVE IFUNC.  XSAVE IFUNC requires 64-byte
alignment anyway.

> > +	/* Besides rdi and rsi, saved above, save rcx, rdx, r8, r9,
> > +	   r10 and r11.  */
>
> I would prefer to see some explicit ABI action first before making such

It is explicitly documented:

https://www.fsfla.org/~lxoliva/writeups/TLS/RFC-TLSDESC-x86.txt

---
The functions defined above use custom calling conventions that
require them to preserve any registers they modify.  This penalizes
the case that requires dynamic TLS, since it must preserve (*) all
call-clobbered registers before calling __tls_get_addr(), but it is
optimized for the most common case of static TLS, and also for the
case in which the code generated by the compiler can be relaxed by the
linker to a more efficient access model: being able to assume no
registers are clobbered by the call tends to improve register
allocation.  Also, the function that handles the dynamic TLS case will
most often be able to avoid calling __tls_get_addr(), thus potentially
avoiding the need for preserving registers.
---

Glibc simply doesn't preserve vector registers.

> I would prefer to see some explicit ABI action first before making such
> decisions.  Maybe %r11 could be a scratch register available for use by

%r11 is caller-saved, not available.

> the TLSDESC call?
> Or does current GCC already assume that all scalar
> registers are preserved?

GCC assumes that only FLAGS_REG is clobbered and RAX is the
destination register.  Everything else must be preserved.

> The other way to fix this is to preallocate everything, so that the need
> for the slow path goes away.

True.

-- 
H.J.
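
For readers following the thread, here is a minimal C sketch of the kind of
code the discussion is about.  It is an illustration only, not taken from the
patch: scaled_sum and its parameters are hypothetical names, and tls_var is
borrowed from the quoted instruction sequence.  Assuming GCC on x86-64 with
-fPIC -mtls-dialect=gnu2, the access to tls_var inside the loop is expected
to use the leaq/call TLSDESC sequence, and the compiler may keep sum and
scale in vector registers across that call because it treats everything
except RAX and the flags as preserved.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical thread-local counter; the name mirrors tls_var in the
   quoted leaq/call sequence.  */
__thread int tls_var;

/* Hypothetical helper: sums v[i] * scale and bumps the thread-local
   counter once per element.  With the GNU2 TLS dialect (an assumption
   about the build flags), each tls_var access can become a TLSDESC
   call while sum and scale are live in vector registers.  */
double
scaled_sum (const double *v, size_t n, double scale)
{
  double sum = 0.0;
  for (size_t i = 0; i < n; i++)
    {
      sum += v[i] * scale;
      tls_var += 1;
    }
  return sum;
}
```

If the TLSDESC resolver took its slow path without saving vector registers,
a function shaped like this is where a corrupted sum could show up; whether
it does in practice depends on the build flags, the linker's relaxation, and
whether the dynamic case is reached at run time.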