From: Hongtao Liu
Date: Fri, 24 Nov 2023 09:03:48 +0800
Subject: Re: [PATCH] i386: Fix AVX512 and AVX10 option issues
To: Haochen Jiang
Cc: gcc-patches@gcc.gnu.org, hongtao.liu@intel.com, ubizjak@gmail.com
References: <20231123060949.618089-1-haochen.jiang@intel.com>
In-Reply-To: <20231123060949.618089-1-haochen.jiang@intel.com>

On Thu, Nov 23, 2023 at 2:10 PM Haochen Jiang wrote:
>
> Hi all,
>
> This patch should be able to fix the current issue mentioned in PR112643.
>
> Also, I fixed some legacy issues in code related to AVX512/AVX10.
>
> Ok for trunk?
Ok
>
> Thx,
> Haochen
>
> gcc/ChangeLog:
>
>         PR target/112643
>         * config/i386/driver-i386.cc (check_avx10_avx512_features):
>         Renamed to ...
>         (check_avx512_features): this and remove avx10 check.
>         (host_detect_local_cpu): Never append -mno-avx10.1-{256,512} to
>         avoid emitting warnings when building GCC with native arch.
>         * config/i386/i386-builtin.def (BDESC): Add missing AVX512VL for
>         128/256 bit builtin for AVX512VP2INTERSECT.
>         * config/i386/i386-options.cc (ix86_option_override_internal):
>         Also check whether the AVX512 flags is set when trying to reset.
>         * config/i386/i386.h
>         (PTA_SKYLAKE_AVX512): Add missing PTA_EVEX512.
>         (PTA_ZNVER4): Ditto.
> ---
>  gcc/config/i386/driver-i386.cc   | 19 +++++++++----------
>  gcc/config/i386/i386-builtin.def |  8 ++++----
>  gcc/config/i386/i386-options.cc  |  8 +++++---
>  gcc/config/i386/i386.h           |  4 ++--
>  4 files changed, 20 insertions(+), 19 deletions(-)
>
> diff --git a/gcc/config/i386/driver-i386.cc b/gcc/config/i386/driver-i386.cc
> index ae67efc49c3..204600e128a 100644
> --- a/gcc/config/i386/driver-i386.cc
> +++ b/gcc/config/i386/driver-i386.cc
> @@ -377,15 +377,10 @@ detect_caches_intel (bool xeon_mp, unsigned max_level,
>     enabled and the other disabled.  Add this function to avoid push "-mno-"
>     options under this scenario for -march=native.  */
>
> -bool check_avx10_avx512_features (__processor_model &cpu_model,
> -                                  unsigned int (&cpu_features2)[SIZE_OF_CPU_FEATURES],
> -                                  const enum processor_features feature)
> +bool check_avx512_features (__processor_model &cpu_model,
> +                            unsigned int (&cpu_features2)[SIZE_OF_CPU_FEATURES],
> +                            const enum processor_features feature)
>  {
> -  if (has_feature (FEATURE_AVX512F)
> -      && ((feature == FEATURE_AVX10_1_256)
> -          || (feature == FEATURE_AVX10_1_512)))
> -    return false;
> -
>    if (has_feature (FEATURE_AVX10_1_256)
>        && ((feature == FEATURE_AVX512F)
>            || (feature == FEATURE_AVX512CD)
> @@ -900,8 +895,12 @@ const char *host_detect_local_cpu (int argc, const char **argv)
>                  options = concat (options, " ",
>                                    isa_names_table[i].option, NULL);
>              }
> -          else if (check_avx10_avx512_features (cpu_model, cpu_features2,
> -                                                isa_names_table[i].feature))
> +          /* Never push -mno-avx10.1-{256,512} under -march=native to
> +             avoid unnecessary warnings when building librarys.  */
> +          else if ((isa_names_table[i].feature != FEATURE_AVX10_1_256)
> +                   && (isa_names_table[i].feature != FEATURE_AVX10_1_512)
> +                   && check_avx512_features (cpu_model, cpu_features2,
> +                                             isa_names_table[i].feature))
>              options = concat (options, neg_option,
>                                isa_names_table[i].option + 2, NULL);
>          }
> diff --git a/gcc/config/i386/i386-builtin.def b/gcc/config/i386/i386-builtin.def
> index 19fa5c107c7..7a5f2676999 100644
> --- a/gcc/config/i386/i386-builtin.def
> +++ b/gcc/config/i386/i386-builtin.def
> @@ -301,10 +301,10 @@ BDESC (OPTION_MASK_ISA_AVX512BW, OPTION_MASK_ISA2_EVEX512, CODE_FOR_avx512bw_sto
>  /* AVX512VP2INTERSECT */
>  BDESC (0, OPTION_MASK_ISA2_AVX512VP2INTERSECT | OPTION_MASK_ISA2_EVEX512, CODE_FOR_nothing, "__builtin_ia32_2intersectd512", IX86_BUILTIN_2INTERSECTD512, UNKNOWN, (int) VOID_FTYPE_PUHI_PUHI_V16SI_V16SI)
>  BDESC (0, OPTION_MASK_ISA2_AVX512VP2INTERSECT | OPTION_MASK_ISA2_EVEX512, CODE_FOR_nothing, "__builtin_ia32_2intersectq512", IX86_BUILTIN_2INTERSECTQ512, UNKNOWN, (int) VOID_FTYPE_PUQI_PUQI_V8DI_V8DI)
> -BDESC (0, OPTION_MASK_ISA2_AVX512VP2INTERSECT, CODE_FOR_nothing, "__builtin_ia32_2intersectd256", IX86_BUILTIN_2INTERSECTD256, UNKNOWN, (int) VOID_FTYPE_PUQI_PUQI_V8SI_V8SI)
> -BDESC (0, OPTION_MASK_ISA2_AVX512VP2INTERSECT, CODE_FOR_nothing, "__builtin_ia32_2intersectq256", IX86_BUILTIN_2INTERSECTQ256, UNKNOWN, (int) VOID_FTYPE_PUQI_PUQI_V4DI_V4DI)
> -BDESC (0, OPTION_MASK_ISA2_AVX512VP2INTERSECT, CODE_FOR_nothing, "__builtin_ia32_2intersectd128", IX86_BUILTIN_2INTERSECTD128, UNKNOWN, (int) VOID_FTYPE_PUQI_PUQI_V4SI_V4SI)
> -BDESC (0, OPTION_MASK_ISA2_AVX512VP2INTERSECT, CODE_FOR_nothing, "__builtin_ia32_2intersectq128", IX86_BUILTIN_2INTERSECTQ128, UNKNOWN, (int) VOID_FTYPE_PUQI_PUQI_V2DI_V2DI)
> +BDESC (OPTION_MASK_ISA_AVX512VL, OPTION_MASK_ISA2_AVX512VP2INTERSECT, CODE_FOR_nothing, "__builtin_ia32_2intersectd256", IX86_BUILTIN_2INTERSECTD256, UNKNOWN, (int) VOID_FTYPE_PUQI_PUQI_V8SI_V8SI)
> +BDESC (OPTION_MASK_ISA_AVX512VL, OPTION_MASK_ISA2_AVX512VP2INTERSECT, CODE_FOR_nothing, "__builtin_ia32_2intersectq256", IX86_BUILTIN_2INTERSECTQ256, UNKNOWN, (int) VOID_FTYPE_PUQI_PUQI_V4DI_V4DI)
> +BDESC (OPTION_MASK_ISA_AVX512VL, OPTION_MASK_ISA2_AVX512VP2INTERSECT, CODE_FOR_nothing, "__builtin_ia32_2intersectd128", IX86_BUILTIN_2INTERSECTD128, UNKNOWN, (int) VOID_FTYPE_PUQI_PUQI_V4SI_V4SI)
> +BDESC (OPTION_MASK_ISA_AVX512VL, OPTION_MASK_ISA2_AVX512VP2INTERSECT, CODE_FOR_nothing, "__builtin_ia32_2intersectq128", IX86_BUILTIN_2INTERSECTQ128, UNKNOWN, (int) VOID_FTYPE_PUQI_PUQI_V2DI_V2DI)
>
>  /* AVX512VL */
>  BDESC (OPTION_MASK_ISA_AVX512BW | OPTION_MASK_ISA_AVX512VL, 0, CODE_FOR_avx512vl_loadv16hi_mask, "__builtin_ia32_loaddquhi256_mask", IX86_BUILTIN_LOADDQUHI256_MASK, UNKNOWN, (int) V16HI_FTYPE_PCSHORT_V16HI_UHI)
> diff --git a/gcc/config/i386/i386-options.cc b/gcc/config/i386/i386-options.cc
> index dd5df559c84..a41bfe546b9 100644
> --- a/gcc/config/i386/i386-options.cc
> +++ b/gcc/config/i386/i386-options.cc
> @@ -2691,10 +2691,12 @@ ix86_option_override_internal (bool main_args_p,
>              {
>                opts->x_ix86_isa_flags = (~avx512_isa_flags
>                                          & opts->x_ix86_isa_flags)
> -                | (avx512_isa_flags & opts->x_ix86_isa_flags_explicit);
> -              opts->x_ix86_isa_flags2 = (~avx512_isa_flags
> +                | (avx512_isa_flags & opts->x_ix86_isa_flags
> +                   & opts->x_ix86_isa_flags_explicit);
> +              opts->x_ix86_isa_flags2 = (~avx512_isa_flags2
>                                           & opts->x_ix86_isa_flags2)
> -                | (avx512_isa_flags2 & opts->x_ix86_isa_flags2_explicit);
> +                | (avx512_isa_flags2 & opts->x_ix86_isa_flags2
> +                   & opts->x_ix86_isa_flags2_explicit);
>              }
>          }
>      }
> diff --git a/gcc/config/i386/i386.h b/gcc/config/i386/i386.h
> index 9c74b3ebd90..47340c6a4ad 100644
> --- a/gcc/config/i386/i386.h
> +++ b/gcc/config/i386/i386.h
> @@ -2375,7 +2375,7 @@ constexpr wide_int_bitmask PTA_SKYLAKE = PTA_BROADWELL | PTA_AES
>    | PTA_CLFLUSHOPT | PTA_XSAVEC | PTA_XSAVES | PTA_SGX;
>  constexpr wide_int_bitmask PTA_SKYLAKE_AVX512 = PTA_SKYLAKE | PTA_AVX512F
>    | PTA_AVX512CD | PTA_AVX512VL | PTA_AVX512BW | PTA_AVX512DQ | PTA_PKU
> -  | PTA_CLWB;
> +  | PTA_CLWB | PTA_EVEX512;
>  constexpr wide_int_bitmask PTA_CASCADELAKE = PTA_SKYLAKE_AVX512
>    | PTA_AVX512VNNI;
>  constexpr wide_int_bitmask PTA_COOPERLAKE = PTA_CASCADELAKE | PTA_AVX512BF16;
> @@ -2441,7 +2441,7 @@ constexpr wide_int_bitmask PTA_ZNVER3 = PTA_ZNVER2 | PTA_VAES | PTA_VPCLMULQDQ
>  constexpr wide_int_bitmask PTA_ZNVER4 = PTA_ZNVER3 | PTA_AVX512F | PTA_AVX512DQ
>    | PTA_AVX512IFMA | PTA_AVX512CD | PTA_AVX512BW | PTA_AVX512VL
>    | PTA_AVX512BF16 | PTA_AVX512VBMI | PTA_AVX512VBMI2 | PTA_GFNI
> -  | PTA_AVX512VNNI | PTA_AVX512BITALG | PTA_AVX512VPOPCNTDQ;
> +  | PTA_AVX512VNNI | PTA_AVX512BITALG | PTA_AVX512VPOPCNTDQ | PTA_EVEX512;
>
>  constexpr wide_int_bitmask PTA_LUJIAZUI = PTA_64BIT | PTA_MMX | PTA_SSE | PTA_SSE2
>    | PTA_SSE3 | PTA_CX16 | PTA_ABM | PTA_SSSE3 | PTA_SSE4_1 | PTA_SSE4_2 | PTA_AES
> --
> 2.31.1
>

-- 
BR,
Hongtao