From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.129.124]) by sourceware.org (Postfix) with ESMTPS id BF0003858D3C for ; Fri, 26 May 2023 03:34:52 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.2 sourceware.org BF0003858D3C Authentication-Results: sourceware.org; dmarc=pass (p=none dis=none) header.from=redhat.com Authentication-Results: sourceware.org; spf=pass smtp.mailfrom=redhat.com DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1685072092; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: in-reply-to:in-reply-to; bh=0GQWnU1y+QH2uUMk3XwsWRGkqYjjU4YW0UK3y6A5f3M=; b=iIPCOv3zfNKvFRmvbEws6SL8lYeTGfTBLMuwOfD1AIX1ue6p9amNGJrJNCYL99pjlyakc2 NNINXn1zpD0EfQdGvO9mAygQ+3oerDBEX78jn+Ehy485PY5f+CimrwuNyM98UVK3cfv0+a 4GW7qkUp9XPkYdQ62BtyUVhUYO+MFss= Received: from mimecast-mx02.redhat.com (mimecast-mx02.redhat.com [66.187.233.88]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id us-mta-373-munZKzIAPviF6GN9TbHsIQ-1; Thu, 25 May 2023 23:34:49 -0400 X-MC-Unique: munZKzIAPviF6GN9TbHsIQ-1 Received: from smtp.corp.redhat.com (int-mx06.intmail.prod.int.rdu2.redhat.com [10.11.54.6]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mimecast-mx02.redhat.com (Postfix) with ESMTPS id 98C2D8032E4; Fri, 26 May 2023 03:34:48 +0000 (UTC) Received: from greed.delorie.com (unknown [10.22.9.251]) by smtp.corp.redhat.com (Postfix) with ESMTPS id 7BC022166B2B; Fri, 26 May 2023 03:34:48 +0000 (UTC) Received: from greed.delorie.com.redhat.com (localhost [127.0.0.1]) by greed.delorie.com (8.15.2/8.15.2) with ESMTP id 34Q3YlBO3681938; Thu, 25 May 2023 23:34:47 -0400 From: DJ Delorie To: Noah Goldstein Cc: libc-alpha@sourceware.org, goldstein.w.n@gmail.com, hjl.tools@gmail.com, carlos@systemhalted.org Subject: Re: [PATCH v9 2/3] x86: Refactor Intel `init_cpu_features` In-Reply-To: <20230513051906.1287611-2-goldstein.w.n@gmail.com> Date: Thu, 25 May 2023 23:34:47 -0400 Message-ID: MIME-Version: 1.0 X-Scanned-By: MIMEDefang 3.1 on 10.11.54.6 X-Mimecast-Spam-Score: 0 X-Mimecast-Originator: redhat.com Content-Type: text/plain X-Spam-Status: No, score=-4.4 required=5.0 tests=BAYES_00,DKIMWL_WL_HIGH,DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF,RCVD_IN_DNSWL_NONE,SPF_HELO_NONE,SPF_NONE,TXREP,T_SCC_BODY_TEXT_LINE autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org List-Id: LGTM with a few nits about comments that I don't really care about, but mention for completeness. Reviewed-by: DJ Delorie Noah Goldstein via Libc-alpha writes: > This patch should have no affect on existing functionality. > +/* Intel Family-6 microarch list. */ > +enum > +{ > + /* Atom processors. */ > + INTEL_ATOM_BONNELL, > + INTEL_ATOM_SILVERMONT, > + INTEL_ATOM_AIRMONT, > + INTEL_ATOM_GOLDMONT, > + INTEL_ATOM_GOLDMONT_PLUS, > + INTEL_ATOM_SIERRAFOREST, > + INTEL_ATOM_GRANDRIDGE, > + INTEL_ATOM_TREMONT, > + > + /* Bigcore processors. */ > + INTEL_BIGCORE_MEROM, > + INTEL_BIGCORE_PENRYN, > + INTEL_BIGCORE_DUNNINGTON, > + INTEL_BIGCORE_NEHALEM, > + INTEL_BIGCORE_WESTMERE, > + INTEL_BIGCORE_SANDYBRIDGE, > + INTEL_BIGCORE_IVYBRIDGE, > + INTEL_BIGCORE_HASWELL, > + INTEL_BIGCORE_BROADWELL, > + INTEL_BIGCORE_SKYLAKE, > + INTEL_BIGCORE_KABYLAKE, > + INTEL_BIGCORE_COMETLAKE, > + INTEL_BIGCORE_SKYLAKE_AVX512, > + INTEL_BIGCORE_CANNONLAKE, > + INTEL_BIGCORE_ICELAKE, > + INTEL_BIGCORE_TIGERLAKE, > + INTEL_BIGCORE_ROCKETLAKE, > + INTEL_BIGCORE_SAPPHIRERAPIDS, > + INTEL_BIGCORE_RAPTORLAKE, > + INTEL_BIGCORE_EMERALDRAPIDS, > + INTEL_BIGCORE_METEORLAKE, > + INTEL_BIGCORE_LUNARLAKE, > + INTEL_BIGCORE_ARROWLAKE, > + INTEL_BIGCORE_GRANITERAPIDS, > + > + /* Mixed (bigcore + atom SOC). */ > + INTEL_MIXED_LAKEFIELD, > + INTEL_MIXED_ALDERLAKE, > + > + /* KNL. */ (Honestly, not a useful comment... ;-) > + INTEL_KNIGHTS_MILL, > + INTEL_KNIGHTS_LANDING, > + > + /* Unknown. */ > + INTEL_UNKNOWN, > +}; Ok. > +static unsigned int > +intel_get_fam6_microarch (unsigned int model, unsigned int stepping) stepping is never used, could it be marked notused or something? > +{ > + switch (model) > + { . . . > + } > +} Ok. > @@ -453,129 +662,143 @@ init_cpu_features (struct cpu_features *cpu_features) > if (family == 0x06) > { > model += extended_model; > - switch (model) > + unsigned int microarch > + = intel_get_fam6_microarch (model, stepping); > + > + switch (microarch) Ok. > { > - case 0x1c: > - case 0x26: > - /* BSF is slow on Atom. */ > + /* Atom / KNL tuning. */ > + case INTEL_ATOM_BONNELL: > + /* BSF is slow on Bonnell. */ Ok. > - case 0x57: > - /* Knights Landing. Enable Silvermont optimizations. */ > - > - case 0x7a: > - /* Unaligned load versions are faster than SSSE3 > - on Goldmont Plus. */ > - > - case 0x5c: > - case 0x5f: > /* Unaligned load versions are faster than SSSE3 > - on Goldmont. */ > + on Airmont, Silvermont, Goldmont, and Goldmont Plus. */ > + case INTEL_ATOM_AIRMONT: > + case INTEL_ATOM_SILVERMONT: > + case INTEL_ATOM_GOLDMONT: > + case INTEL_ATOM_GOLDMONT_PLUS: Ok. > - case 0x4c: > - case 0x5a: > - case 0x75: > - /* Airmont is a die shrink of Silvermont. */ > + /* Knights Landing. Enable Silvermont optimizations. */ > + case INTEL_KNIGHTS_LANDING: > > - case 0x37: > - case 0x4a: > - case 0x4d: > - case 0x5d: > - /* Unaligned load versions are faster than SSSE3 > - on Silvermont. */ Ok. > cpu_features->preferred[index_arch_Fast_Unaligned_Load] > - |= (bit_arch_Fast_Unaligned_Load > - | bit_arch_Fast_Unaligned_Copy > - | bit_arch_Prefer_PMINUB_for_stringop > - | bit_arch_Slow_SSE4_2); > + |= (bit_arch_Fast_Unaligned_Load > + | bit_arch_Fast_Unaligned_Copy > + | bit_arch_Prefer_PMINUB_for_stringop > + | bit_arch_Slow_SSE4_2); > break; Ok. > - case 0x86: > - case 0x96: > - case 0x9c: > + case INTEL_ATOM_TREMONT: Ok. > cpu_features->preferred[index_arch_Fast_Rep_String] > - |= (bit_arch_Fast_Rep_String > - | bit_arch_Fast_Unaligned_Load > - | bit_arch_Fast_Unaligned_Copy > - | bit_arch_Prefer_PMINUB_for_stringop > - | bit_arch_Slow_SSE4_2); > + |= (bit_arch_Fast_Rep_String | bit_arch_Fast_Unaligned_Load Minor nit: I think the one-per-line is easier to follow. Ok though. > + | bit_arch_Fast_Unaligned_Copy > + | bit_arch_Prefer_PMINUB_for_stringop > + | bit_arch_Slow_SSE4_2); Ok. > > + /* > + Default tuned Knights microarch. > + case INTEL_KNIGHTS_MILL: > + */ > + > + /* > + Default tuned atom microarch. > + case INTEL_ATOM_SIERRAFOREST: > + case INTEL_ATOM_GRANDRIDGE: > + */ > + > + /* Bigcore/Default Tuning. */ > default: Ok. > /* Unknown family 0x06 processors. Assuming this is one > of Core i3/i5/i7 processors if AVX is available. */ > if (!CPU_FEATURES_CPU_P (cpu_features, AVX)) > break; > - /* Fall through. */ This should stay, as we're still falling through. > - case 0x1a: > - case 0x1e: > - case 0x1f: > - case 0x25: > - case 0x2c: > - case 0x2e: > - case 0x2f: > + case INTEL_BIGCORE_NEHALEM: > + case INTEL_BIGCORE_WESTMERE: Ok. > /* Rep string instructions, unaligned load, unaligned copy, > and pminub are fast on Intel Core i3, i5 and i7. */ > cpu_features->preferred[index_arch_Fast_Rep_String] > - |= (bit_arch_Fast_Rep_String > - | bit_arch_Fast_Unaligned_Load > - | bit_arch_Fast_Unaligned_Copy > - | bit_arch_Prefer_PMINUB_for_stringop); > + |= (bit_arch_Fast_Rep_String | bit_arch_Fast_Unaligned_Load > + | bit_arch_Fast_Unaligned_Copy > + | bit_arch_Prefer_PMINUB_for_stringop); Same nit as before. > break; > + > + /* > + Default tuned Bigcore microarch. This is a bit confusing, because these cases are not actually active here. It seems to mean "these CPUs will end up here due to logic above" but why not make that explicit, either by saying so, or activating these cases? [future me: they're used in 3/3, so ok] > + case INTEL_BIGCORE_SANDYBRIDGE: > + case INTEL_BIGCORE_IVYBRIDGE: > + case INTEL_BIGCORE_HASWELL: > + case INTEL_BIGCORE_BROADWELL: > + case INTEL_BIGCORE_SKYLAKE: > + case INTEL_BIGCORE_KABYLAKE: > + case INTEL_BIGCORE_COMETLAKE: > + case INTEL_BIGCORE_SKYLAKE_AVX512: > + case INTEL_BIGCORE_CANNONLAKE: > + case INTEL_BIGCORE_ICELAKE: > + case INTEL_BIGCORE_TIGERLAKE: > + case INTEL_BIGCORE_ROCKETLAKE: > + case INTEL_BIGCORE_RAPTORLAKE: > + case INTEL_BIGCORE_METEORLAKE: > + case INTEL_BIGCORE_LUNARLAKE: > + case INTEL_BIGCORE_ARROWLAKE: > + case INTEL_BIGCORE_SAPPHIRERAPIDS: > + case INTEL_BIGCORE_EMERALDRAPIDS: > + case INTEL_BIGCORE_GRANITERAPIDS: > + */ > + > + /* > + Default tuned Mixed (bigcore + atom SOC). > + case INTEL_MIXED_LAKEFIELD: > + case INTEL_MIXED_ALDERLAKE: > + */ > } Ok. > - /* Disable TSX on some processors to avoid TSX on kernels that > - weren't updated with the latest microcode package (which > - disables broken feature by default). */ > - switch (model) > + /* Disable TSX on some processors to avoid TSX on kernels that > + weren't updated with the latest microcode package (which > + disables broken feature by default). */ > + switch (microarch) Ok. > - case 0x55: > + case INTEL_BIGCORE_SKYLAKE_AVX512: > + /* 0x55 (Skylake-avx512) && stepping <= 5 disable TSX. */ > if (stepping <= 5) > goto disable_tsx; Ok. > - case 0x8e: > - /* NB: Although the errata documents that for model == 0x8e, > - only 0xb stepping or lower are impacted, the intention of > - the errata was to disable TSX on all client processors on > - all steppings. Include 0xc stepping which is an Intel > - Core i7-8665U, a client mobile processor. */ > - case 0x9e: > - if (stepping > 0xc) > + > + case INTEL_BIGCORE_SKYLAKE: > + case INTEL_BIGCORE_KABYLAKE: > + /* NB: Although the errata documents that for model == 0x8e > + (skylake client), only 0xb stepping or lower are impacted, > + the intention of the errata was to disable TSX on all client > + processors on all steppings. Include 0xc stepping which is > + an Intel Core i7-8665U, a client mobile processor. */ > + if ((model == 0x8e || model == 0x9e) && stepping > 0xc) Could have just used INTEL_BIGCORE_KABYLAKE instead of model numbers here, but ok. > break; > - /* Fall through. */ > - case 0x4e: > - case 0x5e: > - { > + Ok. > processors listed in: > > https://www.intel.com/content/www/us/en/support/articles/000059422/processors.html > */ > -disable_tsx: > + disable_tsx: > CPU_FEATURE_UNSET (cpu_features, HLE); > CPU_FEATURE_UNSET (cpu_features, RTM); > CPU_FEATURE_SET (cpu_features, RTM_ALWAYS_ABORT); > - } > - break; Matches brace removal above. Ok. > - case 0x3f: > - /* Xeon E7 v3 with stepping >= 4 has working TSX. */ > - if (stepping >= 4) > break; > - /* Fall through. */ > - case 0x3c: > - case 0x45: > - case 0x46: > - /* Disable Intel TSX on Haswell processors (except Xeon E7 v3 > - with stepping >= 4) to avoid TSX on kernels that weren't > - updated with the latest microcode package (which disables > - broken feature by default). */ > - CPU_FEATURE_UNSET (cpu_features, RTM); > - break; > + > + case INTEL_BIGCORE_HASWELL: > + /* Xeon E7 v3 (model == 0x3f) with stepping >= 4 has working > + TSX. Haswell also include other model numbers that have > + working TSX. */ > + if (model == 0x3f && stepping >= 4) > + break; > + > + CPU_FEATURE_UNSET (cpu_features, RTM); > + break; > } > } Ok.