Date: Mon, 12 Apr 2021 17:14:17 +0000
From: Sean Christopherson
To: Len Brown
Cc: Andy Lutomirski, David Laight, Dave Hansen, Greg KH, "Bae, Chang Seok", X86 ML, LKML, libc-alpha, Florian Weimer, Rich Felker, Kyle Huey, Keno Fischer, Linux API
Subject: Re: Candidate Linux ABI for Intel AMX and hypothetical new related features
List-Id: Libc-alpha mailing list

On Sun, Apr 11, 2021, Len Brown wrote:
> On Fri, Apr 9, 2021 at 5:44 PM Andy Lutomirski wrote:
> >
> > On Fri, Apr 9, 2021 at 1:53 PM Len Brown wrote:
> > >
> > > On Wed, Mar 31, 2021 at 6:45 PM Andy Lutomirski wrote:
> > > >
> > > > On Wed, Mar 31, 2021 at 3:28 PM Len Brown wrote:
> > > > > We've also established that when running in a VMM, every update to
> > > > > XCR0 causes a VMEXIT.
> > > >
> > > > This is true, it sucks, and Intel could fix it going forward.
> > >
> > > What hardware fix do you suggest?
> > > If a guest is permitted to set XCR0 bits without notifying the VMM,
> > > what happens when it sets bits that the VMM doesn't know about?
> >
> > The VM could have a mask of allowed XCR0 bits that don't exist.
> >
> > TDX solved this problem *somehow* -- XSETBV doesn't (visibly?) exit on
> > TDX. Surely plain VMX could fix it too.
>
> There are two cases.
>
> 1. Hardware that exists today and in the foreseeable future.
>
> VM modification of XCR0 results in VMEXIT to VMM.
> The VMM sees bits set by the guest, and so it can accept what
> it supports, or send the VM a fault for non-support.
>
> Here it is not possible for the VMM to change XCR0 without the VMM knowing.
>
> 2. Future Hardware that allows guests to write XCR0 w/o VMEXIT.
>
> Not sure I follow your proposal.
>
> Yes, the VM effectively has a mask of what is supported,
> because it can issue CPUID.
>
> The VMM virtualizes CPUID, and needs to know it must not
> expose to the VM any state features it doesn't support.
> Also, the VMM needs to audit XCR0 before it uses XSAVE,
> else the guest could attack or crash the VMM through
> buffer overrun.

The VMM already needs to context switch XCR0 and XSS, so this is a non-issue.

> Is this what you suggest?

Yar. In TDX, XSETBV exits, but only to the TDX module. I.e. TDX solves the
problem in software by letting the VMM tell the TDX module what features the
guest can set in XCR0/XSS via the XFAM (Extended Features Allowed Mask).

But, that software "fix" can also be pushed into ucode, e.g. add an XFAM VMCS
field, the guest can set any XCR0 bits that are '1' in VMCS.XFAM without exiting.

Note, SGX has similar functionality in the form of XFRM (XSAVE-Feature Request
Mask). The enclave author can specify what features will be enabled in XCR0
when the enclave is running. Not that relevant, other than to reinforce that
this is a solvable problem.

> If yes, what do you suggest in the years between now and when
> that future hardware and VMM exist?

Burn some patch space? :-)
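
For illustration only: the XFAM idea described above boils down to a mask
check on the value the guest writes with XSETBV. Below is a minimal sketch in
C, assuming a hypothetical per-VM `xfam` mask. The function name is made up
for this example, and no such VMCS field exists in current VMX. The sketch
also ignores the architectural consistency rules for XCR0 (e.g. bit 0 must
always be set), which would fault regardless of any mask.

```c
#include <stdbool.h>
#include <stdint.h>

/* Architectural XCR0 state-component bits, listed here for illustration. */
#define XFEATURE_MASK_FP        (1ULL << 0)   /* x87 state */
#define XFEATURE_MASK_SSE       (1ULL << 1)   /* SSE state */
#define XFEATURE_MASK_YMM       (1ULL << 2)   /* AVX state */
#define XFEATURE_MASK_XTILECFG  (1ULL << 17)  /* AMX TILECFG */
#define XFEATURE_MASK_XTILEDATA (1ULL << 18)  /* AMX TILEDATA */

/*
 * Hypothetical model of the proposed check, not an existing interface:
 * 'xfam' is the mask of XCR0 bits the VMM allows the guest to set.
 * Returns true if the write touches a bit outside the mask and so would
 * still have to be handed to the VMM (or faulted); returns false if
 * hardware or ucode could complete the write without exiting.
 */
static bool xsetbv_needs_exit(uint64_t xfam, uint64_t new_xcr0)
{
	return (new_xcr0 & ~xfam) != 0;
}
```

With `xfam` set to FP | SSE | YMM, a guest write of FP | SSE would complete
without an exit, whereas a write that also sets the AMX tile bits would still
go to the VMM, which is what happens for every XSETBV under plain VMX today.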