From: Len Brown
Date: Thu, 20 May 2021 17:49:33 -0400
Subject: Re: Candidate Linux ABI for Intel AMX and hypothetical new related features
To: Thomas Gleixner
Cc: Borislav Petkov, Willy Tarreau, Andy Lutomirski, Florian Weimer,
 "Bae, Chang Seok", Dave Hansen, X86 ML, LKML, Linux API,
 "libc-alpha@sourceware.org", Rich Felker, Kyle Huey, Keno Fischer,
 Arjan van de Ven
In-Reply-To: <87h7ixaxs9.ffs@nanos.tec.linutronix.de>
References: <20210415044258.GA6318@zn.tnic> <20210415052938.GA2325@1wt.eu>
 <20210415054713.GB6318@zn.tnic> <20210419141454.GE9093@zn.tnic>
 <20210419191539.GH9093@zn.tnic> <20210419215809.GJ9093@zn.tnic>
 <874kf11yoz.ffs@nanos.tec.linutronix.de>
 <87k0ntazyn.ffs@nanos.tec.linutronix.de>
 <87h7ixaxs9.ffs@nanos.tec.linutronix.de>
List-Id: Libc-alpha mailing list

On Thu, May 20, 2021 at 5:41 PM Thomas Gleixner wrote:
>
> Len,
>
> On Thu, May 20 2021 at 17:22, Len Brown wrote:
> > On Thu, May 20, 2021 at 4:54 PM Thomas Gleixner wrote:
> >> > AMX is analogous to the multiplier used by AVX-512.
> >> > The architectural state must exist on every CPU, including HT siblings.
> >> > Today, the HT siblings share the same execution unit,
> >> > and I have no reason to expect that will change.
> >>
> >> I'm well aware that HT siblings share the same execution unit for
> >> AVX.
> >>
> >> Though AMX is if I remember the discussions two years ago correctly
> >> shared by more than the HT siblings which makes things worse.
> >
> > I regret that we were unable to get together in the last year to have
> > an updated discussion. I think if we had, then we would have saved
> > a lot of mis-understanding and a lot of email!
> >
> > So let me emphasize here:
> >
> > There is one TMUL execution unit per core.
> > It is shared by the HT siblings within that core.
> >
> > So the comparison to the AVX-512 multiplier is a good one.
>
> Fine, but that does not at all change the facts that:
>
>   1) It's shared between logical CPUs
>
>   2) It has effects on power/thermal and therefore effects which reach
>      outside of the core scope

FWIW, this is true of *every* instruction in the CPU.
Indeed, even when the CPU is executing *no* instructions at all,
the C-state chosen by that CPU has power/thermal impacts on its peers.

Granted, high performance instructions such as AVX-512 and TMUL
are the most extreme case.

>   3) Your approach of making it unconditionally available via the
>      proposed #NM prevents the OS and subsequently the system admin /
>      system designer to implement fine grained control over that
>      resource.
>
>      And no, an opt-in approach by providing a non-mandatory
>      preallocation prctl does not solve that problem.

I'm perfectly fine with making the explicit allocation (aka opt-in)
mandatory, and enforcing it.

Len Brown, Intel Open Source Technology Center
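
For readers following the thread, here is a minimal sketch of what a mandatory opt-in looks like from userspace. It is not the interface proposed in this message. It assumes the arch_prctl() request that was later merged in mainline Linux (5.16), ARCH_REQ_XCOMP_PERM with XFEATURE_XTILEDATA. The helper name request_amx_permission() is hypothetical, and the constants are defined locally in case the installed headers predate them.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <unistd.h>
#include <sys/syscall.h>

/* Values as found in the Linux x86 uapi headers from 5.16 onward;
 * defined here only as a fallback for older header sets. */
#ifndef ARCH_REQ_XCOMP_PERM
#define ARCH_REQ_XCOMP_PERM 0x1023
#endif
#ifndef XFEATURE_XTILEDATA
#define XFEATURE_XTILEDATA 18   /* AMX tile data state component */
#endif

/*
 * Hypothetical helper: ask the kernel for permission to use AMX tile
 * data before executing any tile instruction.
 * Returns 0 if permission was granted, or a negative errno value if the
 * kernel or CPU does not support the request or refuses it.
 */
int request_amx_permission(void)
{
#if defined(__linux__) && defined(__x86_64__)
    if (syscall(SYS_arch_prctl, ARCH_REQ_XCOMP_PERM, XFEATURE_XTILEDATA) == 0)
        return 0;
    return -errno;
#else
    /* Not an x86-64 Linux build: the request does not exist here. */
    return -ENOSYS;
#endif
}
```

A program would call request_amx_permission() once at startup and fall back to a non-AMX code path on any non-zero return. Under a mandatory opt-in, a task that skips this step and executes a tile instruction anyway is expected to be signalled rather than silently given the state.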