From: Andy Lutomirski
Date: Fri, 16 Apr 2021 15:03:39 -0700
Subject: Re: Candidate Linux ABI for Intel AMX and hypothetical new related features
To: Len Brown
Cc: Andy Lutomirski, Willy Tarreau, Florian Weimer, "Bae, Chang Seok",
    Dave Hansen, X86 ML, LKML, linux-abi@vger.kernel.org,
    "libc-alpha@sourceware.org", Rich Felker, Kyle Huey, Keno Fischer
References: <87lf9nk2ku.fsf@oldenburg.str.redhat.com>

On Fri, Apr 16, 2021 at 2:54 PM Len Brown wrote:
>
> On Thu, Apr 15, 2021 at 12:24 PM Andy Lutomirski wrote:
> > On Wed, Apr 14, 2021 at 2:48 PM Len Brown wrote:
>
> > > > ... the transition penalty into and out of AMX code
>
> The concept of 'transition' exists between AVX and SSE instructions
> because it is possible to mix both instruction sets and touch different
> parts of the same registers.  The "unused" parts of those registers
> need to be tracked to assure that data is not lost when mixing.

I get it.  That does not explain why LDMXCSR and VLDMXCSR cause
pipeline stalls.

>
> This concept is moot with AMX, which has its own dedicated registers.
>
> > What is the actual impact of a trivial function that initializes the
> > tile config, does one tiny math op, and then does TILERELEASE?

^^^^ "does one tiny math op"

AVX-512 *also* has sort-of-dedicated registers: ZMM16 and up.  I still
can't find any conclusive evidence as to whether that avoids the
performance hit.

Intel's track record at actually explaining what operations cause what
particular performance disasters is poor, and your explanation is not
helping the situation.  Sorry.