From: Hongtao Liu
Date: Fri, 22 Sep 2023 11:30:50 +0800
Subject: Re: [PATCH 00/18] Support -mevex512 for AVX512
To: "Hu, Lin1"
Cc: gcc-patches@gcc.gnu.org, hongtao.liu@intel.com, ubizjak@gmail.com, haochen.jiang@intel.com
In-Reply-To: <20230921072013.2124750-1-lin1.hu@intel.com>

On Thu, Sep 21, 2023 at 3:22 PM Hu, Lin1 wrote:
>
> Hi all,
>
> After previous discussion, instead of supporting option -mavx10.1, we
> will first introduce option -m[no-]evex512, which will enable/disable
> 512 bit register and 64 bit mask register.
>
> It will not change the current option behavior since if AVX512F is
> enabled with no evex512 option specified, it will automatically enable
> 512 bit register and 64 bit mask register.
>
> The patches are organized as follows:
>
> Patch 1 added initial support for option -mevex512.
>
> Patch 2-6 refined current intrin file to push evex512 target for all
> 512 bit intrins. Those scalar intrins remained untouched.
>
> Patch 7-11 added OPTION_MASK_ISA2_EVEX512 for all related builtins.
>
> Patch 12 disabled zmm register, 512 bit libmvec call for no-evex512,
> also requested evex512 for vectorization when using 512 bit register.
>
> Patch 13-17 supported evex512 in related patterns.
>
> Patch 18 added testcases for -mno-evex512 and allowed its usage.
>
> The patches currently cause scan-asm fail for pr89229-{5,6,7}b.c since
> we will emit scalar vmovss here. When trying to use x/ymm 16+ w/o
> avx512vl but with avx512f+evex512, I suppose we could either emit scalar
> or zmm instructions. It is quite a rare case on HW since there is no
> HW w/o avx512vl but with avx512f, so I prefer not to add maintenance
> effort here for a slight perf improvement. But it could be changed
> to former behavior.

To make it easier for people to test before committing, I pushed the
patch to the vendor branch refs/vendors/ix86/heads/evex512.
You are welcome to try it out.

>
> Discussions are welcomed for all the patches.
>
> Thx,
> Haochen
>
> Haochen Jiang (18):
>   Initial support for -mevex512
>   Push evex512 target for 512 bit intrins
>   Push evex512 target for 512 bit intrins
>   Push evex512 target for 512 bit intrins
>   Push evex512 target for 512 bit intrins
>   Push evex512 target for 512 bit intrins
>   Add OPTION_MASK_ISA2_EVEX512 for 512 bit builtins
>   Add OPTION_MASK_ISA2_EVEX512 for 512 bit builtins
>   Add OPTION_MASK_ISA2_EVEX512 for 512 bit builtins
>   Add OPTION_MASK_ISA2_EVEX512 for 512 bit builtins
>   Add OPTION_MASK_ISA2_EVEX512 for 512 bit builtins
>   Disable zmm register and 512 bit libmvec call when !TARGET_EVEX512
>   Support -mevex512 for AVX512F intrins
>   Support -mevex512 for AVX512DQ intrins
>   Support -mevex512 for AVX512BW intrins
>   Support -mevex512 for
>     AVX512{IFMA,VBMI,VNNI,BF16,VPOPCNTDQ,VBMI2,BITALG,VP2INTERSECT},VAES,GFNI,VPCLMULQDQ
>     intrins
>   Support -mevex512 for AVX512FP16 intrins
>   Allow -mno-evex512 usage
>
>  gcc/common/config/i386/i386-common.cc | 15 +
>  gcc/config.gcc | 19 +-
>  gcc/config/i386/avx5124fmapsintrin.h | 2 +-
>  gcc/config/i386/avx5124vnniwintrin.h | 2 +-
>  gcc/config/i386/avx512bf16intrin.h | 31 +-
>  gcc/config/i386/avx512bitalgintrin.h | 155 +-
>  gcc/config/i386/avx512bitalgvlintrin.h | 180 +
>  gcc/config/i386/avx512bwintrin.h | 291 +-
>  gcc/config/i386/avx512dqintrin.h | 1840 +-
>  gcc/config/i386/avx512erintrin.h | 2 +-
>  gcc/config/i386/avx512fintrin.h | 19663 +++++++++---------
>  gcc/config/i386/avx512fp16intrin.h | 8925 ++++----
>  gcc/config/i386/avx512ifmaintrin.h | 4 +-
>  gcc/config/i386/avx512pfintrin.h | 2 +-
>  gcc/config/i386/avx512vbmi2intrin.h | 4 +-
>  gcc/config/i386/avx512vbmiintrin.h | 4 +-
>  gcc/config/i386/avx512vnniintrin.h | 4 +-
>  gcc/config/i386/avx512vp2intersectintrin.h | 4 +-
>  gcc/config/i386/avx512vpopcntdqintrin.h | 4 +-
>  gcc/config/i386/gfniintrin.h | 76 +-
>  gcc/config/i386/i386-builtin.def | 1312 +-
>  gcc/config/i386/i386-builtins.cc | 96 +-
>  gcc/config/i386/i386-c.cc | 2 +
>  gcc/config/i386/i386-expand.cc | 18 +-
>  gcc/config/i386/i386-options.cc | 33 +-
>  gcc/config/i386/i386.cc | 168 +-
>  gcc/config/i386/i386.h | 7 +-
>  gcc/config/i386/i386.md | 127 +-
>  gcc/config/i386/i386.opt | 4 +
>  gcc/config/i386/immintrin.h | 2 +
>  gcc/config/i386/predicates.md | 3 +-
>  gcc/config/i386/sse.md | 854 +-
>  gcc/config/i386/vaesintrin.h | 4 +-
>  gcc/config/i386/vpclmulqdqintrin.h | 4 +-
>  gcc/testsuite/gcc.target/i386/noevex512-1.c | 13 +
>  gcc/testsuite/gcc.target/i386/noevex512-2.c | 13 +
>  gcc/testsuite/gcc.target/i386/noevex512-3.c | 13 +
>  gcc/testsuite/gcc.target/i386/pr89229-5b.c | 2 +-
>  gcc/testsuite/gcc.target/i386/pr89229-6b.c | 2 +-
>  gcc/testsuite/gcc.target/i386/pr89229-7b.c | 2 +-
>  gcc/testsuite/gcc.target/i386/pr90096.c | 2 +-
>  41 files changed, 17170 insertions(+), 16738 deletions(-)
>  create mode 100644 gcc/config/i386/avx512bitalgvlintrin.h
>  create mode 100644 gcc/testsuite/gcc.target/i386/noevex512-1.c
>  create mode 100644 gcc/testsuite/gcc.target/i386/noevex512-2.c
>  create mode 100644 gcc/testsuite/gcc.target/i386/noevex512-3.c
>
> --
> 2.31.1
>

-- 
BR,
Hongtao
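
For anyone trying the branch, here is a minimal sketch of how user code
might tell at compile time whether 512 bit vectors are usable under
-m[no-]evex512. It assumes the compiler predefines a macro named
__EVEX512__ when evex512 is enabled; that name is an assumption, since
the cover letter only shows that i386-c.cc is touched. The helper names
max_vector_bits and evex512_enabled are hypothetical.

```c
#include <stdio.h>

/* Assumption: __EVEX512__ is predefined when 512 bit EVEX is enabled.
   The macro name is not stated in this mail.  */
int evex512_enabled(void)
{
#if defined(__AVX512F__) && defined(__EVEX512__)
    return 1;
#else
    return 0;
#endif
}

/* Widest vector register width, in bits, implied by the predefined
   ISA macros of the current compilation.  */
int max_vector_bits(void)
{
#if defined(__AVX512F__) && defined(__EVEX512__)
    return 512;   /* zmm registers allowed */
#elif defined(__AVX__)
    return 256;   /* ymm only, e.g. with -mavx512f -mno-evex512 */
#elif defined(__SSE2__)
    return 128;   /* xmm only */
#else
    return 0;     /* no x86 vector ISA macro seen */
#endif
}

/* Print the result; call this from main when experimenting.  */
void report_vector_width(void)
{
    printf("evex512: %d, max vector bits: %d\n",
           evex512_enabled(), max_vector_bits());
}
```

Building the same file once with -mavx512f and once with
-mavx512f -mno-evex512 and calling report_vector_width from main should
show whether the two configurations are distinguished, provided the
macro assumption above holds.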