From: Noah Goldstein
Date: Wed, 5 Apr 2023 16:05:46 -0500
Subject: Re: [PATCH 18/19] : Add AMX-COMPLEX support
To: "H.J. Lu"
Cc: libc-alpha@sourceware.org
In-Reply-To: <20230405162144.984598-19-hjl.tools@gmail.com>
References: <20230405162144.984598-1-hjl.tools@gmail.com> <20230405162144.984598-19-hjl.tools@gmail.com>

On Wed, Apr 5, 2023 at 11:27 AM H.J. Lu via Libc-alpha wrote:
>
> Add AMX-COMPLEX support to .
> ---
>  manual/platform.texi               | 3 +++
>  sysdeps/x86/bits/platform/x86.h    | 1 +
>  sysdeps/x86/cpu-features.c         | 2 ++
>  sysdeps/x86/include/cpu-features.h | 3 +++
>  sysdeps/x86/tst-get-cpu-features.c | 2 ++
>  5 files changed, 11 insertions(+)
>
> diff --git a/manual/platform.texi b/manual/platform.texi
> index 1e120993d7..e7448ffc1a 100644
> --- a/manual/platform.texi
> +++ b/manual/platform.texi
> @@ -197,6 +197,9 @@ The supported processor features are:
>  @item
>  @code{AMX_BF16} -- Tile computational operations on bfloat16 numbers.
>
> +@item
> +@code{AMX_COMPLEX} -- Tile computational operations on complex FP16 numbers.
> +
>  @item
>  @code{AMX_INT8} -- Tile computational operations on 8-bit numbers.
>
> diff --git a/sysdeps/x86/bits/platform/x86.h b/sysdeps/x86/bits/platform/x86.h
> index d8ba33bd42..96eb4c070d 100644
> --- a/sysdeps/x86/bits/platform/x86.h
> +++ b/sysdeps/x86/bits/platform/x86.h
> @@ -310,6 +310,7 @@ enum
>
>    x86_cpu_AVX_VNNI_INT8 = x86_cpu_index_7_ecx_1_edx + 4,
>    x86_cpu_AVX_NE_CONVERT = x86_cpu_index_7_ecx_1_edx + 5,
> +  x86_cpu_AMX_COMPLEX = x86_cpu_index_7_ecx_1_edx + 8,
>
>    x86_cpu_index_19_ebx
>      = (CPUID_INDEX_19 * 8 * 4 * sizeof (unsigned int)
> diff --git a/sysdeps/x86/cpu-features.c b/sysdeps/x86/cpu-features.c
> index dfd1b85dce..c2bea6a32d 100644
> --- a/sysdeps/x86/cpu-features.c
> +++ b/sysdeps/x86/cpu-features.c
> @@ -221,6 +221,8 @@ update_active (struct cpu_features *cpu_features)
>        CPU_FEATURE_SET_ACTIVE (cpu_features, AMX_INT8);
>        /* Determine if AMX_FP16 is usable. */
>        CPU_FEATURE_SET_ACTIVE (cpu_features, AMX_FP16);
> +      /* Determine if AMX_COMPLEX is usable. */
> +      CPU_FEATURE_SET_ACTIVE (cpu_features, AMX_COMPLEX);
>      }
>
>    /* These features are usable only when OSXSAVE is enabled. */
> diff --git a/sysdeps/x86/include/cpu-features.h b/sysdeps/x86/include/cpu-features.h
> index 673cf8ca92..f14c1078d5 100644
> --- a/sysdeps/x86/include/cpu-features.h
> +++ b/sysdeps/x86/include/cpu-features.h
> @@ -317,6 +317,7 @@ enum
>  /* EDX. */
>  #define bit_cpu_AVX_VNNI_INT8 (1u << 4)
>  #define bit_cpu_AVX_NE_CONVERT (1u << 5)
> +#define bit_cpu_AMX_COMPLEX (1u << 8)
>
>  /* CPUID_INDEX_19. */
>
> @@ -558,6 +559,7 @@ enum
>  #define index_cpu_LAM CPUID_INDEX_7_ECX_1
>  #define index_cpu_AVX_VNNI_INT8 CPUID_INDEX_7_ECX_1
>  #define index_cpu_AVX_NE_CONVERT CPUID_INDEX_7_ECX_1
> +#define index_cpu_AMX_COMPLEX CPUID_INDEX_7_ECX_1
>
>  /* CPUID_INDEX_19. */
>
> @@ -801,6 +803,7 @@ enum
>  /* EDX. */
>  #define reg_AVX_VNNI_INT8 edx
>  #define reg_AVX_NE_CONVERT edx
> +#define reg_AMX_COMPLEX edx
>
>  /* CPUID_INDEX_19. */
>
> diff --git a/sysdeps/x86/tst-get-cpu-features.c b/sysdeps/x86/tst-get-cpu-features.c
> index bb1b67fd1c..87fe27340f 100644
> --- a/sysdeps/x86/tst-get-cpu-features.c
> +++ b/sysdeps/x86/tst-get-cpu-features.c
> @@ -217,6 +217,7 @@ do_test (void)
>    CHECK_CPU_FEATURE_PRESENT (MSRLIST);
>    CHECK_CPU_FEATURE_PRESENT (AVX_VNNI_INT8);
>    CHECK_CPU_FEATURE_PRESENT (AVX_NE_CONVERT);
> +  CHECK_CPU_FEATURE_PRESENT (AMX_COMPLEX);
>    CHECK_CPU_FEATURE_PRESENT (AESKLE);
>    CHECK_CPU_FEATURE_PRESENT (WIDE_KL);
>    CHECK_CPU_FEATURE_PRESENT (PTWRITE);
> @@ -386,6 +387,7 @@ do_test (void)
>    CHECK_CPU_FEATURE_ACTIVE (AVX_IFMA);
>    CHECK_CPU_FEATURE_ACTIVE (AVX_VNNI_INT8);
>    CHECK_CPU_FEATURE_ACTIVE (AVX_NE_CONVERT);
> +  CHECK_CPU_FEATURE_ACTIVE (AMX_COMPLEX);
>    CHECK_CPU_FEATURE_ACTIVE (AESKLE);
>    CHECK_CPU_FEATURE_ACTIVE (WIDE_KL);
>    CHECK_CPU_FEATURE_ACTIVE (PTWRITE);
> --
> 2.39.2
>

LGTM

Reviewed-by: Noah Goldstein
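
For reference, the bit the patch wires up (CPUID leaf 7, subleaf 1, EDX bit 8, per the bit_cpu/index_cpu/reg definitions above) can be cross-checked outside glibc with a small standalone sketch. The helper names edx_has_amx_complex and cpu_reports_amx_complex are hypothetical and not part of the patch. The sketch assumes GCC's or Clang's <cpuid.h> on x86, and it only reports the raw CPUID bit, which is not the same question as the usability check added in update_active.

```c
#include <stdbool.h>
#include <stdint.h>

#if defined(__x86_64__) || defined(__i386__)
# include <cpuid.h>   /* GCC/Clang helper header; assumed available.  */
#endif

/* Bit position copied from the patch: bit_cpu_AMX_COMPLEX is (1u << 8)
   in EDX of CPUID leaf 7, subleaf 1.  */
#define AMX_COMPLEX_EDX_BIT (1u << 8)

/* Hypothetical helper: decode the AMX-COMPLEX bit from a raw EDX value.  */
static bool
edx_has_amx_complex (uint32_t edx)
{
  return (edx & AMX_COMPLEX_EDX_BIT) != 0;
}

/* Hypothetical helper: query CPUID directly.  Returns false on non-x86
   targets or when leaf 7 subleaf 1 is not implemented.  */
static bool
cpu_reports_amx_complex (void)
{
#if defined(__x86_64__) || defined(__i386__)
  unsigned int eax = 0, ebx = 0, ecx = 0, edx = 0;

  /* Subleaf 0 of leaf 7 returns the highest supported subleaf in EAX.  */
  if (!__get_cpuid_count (7, 0, &eax, &ebx, &ecx, &edx) || eax < 1)
    return false;

  if (!__get_cpuid_count (7, 1, &eax, &ebx, &ecx, &edx))
    return false;

  return edx_has_amx_complex (edx);
#else
  return false;
#endif
}
```

On a machine where cpu_reports_amx_complex () returns true, CHECK_CPU_FEATURE_PRESENT (AMX_COMPLEX) in tst-get-cpu-features should presumably agree; whether the feature is also reported as active is decided separately by the update_active change.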