From: Richard Biener
Date: Thu, 31 Aug 2023 11:19:00 +0200
Subject: Re: [PATCH 00/13] [RFC] Support Intel APX EGPR
To: Hongyu Wang
Cc: gcc-patches@gcc.gnu.org, jakub@redhat.com, hongtao.liu@intel.com, hubicka@ucw.cz
In-Reply-To: <20230831082024.314097-1-hongyu.wang@intel.com>

On Thu, Aug 31, 2023 at 10:22 AM Hongyu Wang via Gcc-patches wrote:
>
> Intel Advanced performance extension (APX) has been released in [1].
> It contains several extensions such as extended 16 general purpose registers
> (EGPRs), push2/pop2, new data destination (NDD), conditional compare
> (CCMP/CTEST) combined with suppress flags write version of common instructions
> (NF). This RFC focused on EGPR implementation in GCC.
>
> APX introduces a REX2 prefix to help represent EGPR for several legacy/SSE
> instructions. For the remaining ones, it promotes some of them using evex
> prefix for EGPR. The main issue in APX is that not all legacy/sse/vex
> instructions support EGPR.
> For example, instructions in legacy opcode map2/3
> cannot use REX2 prefix since there is only 1 bit in REX2 to indicate map0/1
> instructions, e.g., pinsrd. Also, for most vector extensions, EGPR is supported
> in their evex forms but not vex forms, which means the mnemonics with no evex
> forms also cannot use EGPR, e.g., vphaddw.
>
> Such limitation brings some challenge with current GCC infrastructure.
> Generally, we use constraints to guide register allocation behavior. For
> register operand, it is easy to add a new constraint to certain insn and limit
> it to legacy or REX registers. But for memory operand, if we only use
> constraint to limit base/index register choice, reload has no backoff when
> process_address allocates any egprs to base/index reg, and then any post-reload
> pass would get ICE from the constraint.

How realistic would it be to simply disable instructions not supporting EGPR?
I hope there are alternatives that would be available in actual APX
implementations?  Otherwise this design limitation doesn't shed a very
positive light on the designers ...

How sure are we actual implementations with APX will appear (just
remembering SSE5...)?  I'm quite sure it's not going to be 2024 so would
it be realistic to postpone APX work to next stage1, targeting GCC 15 only?

> Here is what we did to address the issue:
>
> Middle-end:
> - Add rtx_insn parameter to base_reg_class, reuse the
>   MODE_CODE_BASE_REG_CLASS macro with rtx_insn parameter.
> - Add index_reg_class like base_reg_class, calls new INSN_INDEX_REG_CLASS
>   macro with rtx_insn parameter.
> - In process_address_1, add rtx_insn parameter to call sites of
>   base_reg_class, replace usage of INDEX_REG_CLASS to index_reg_class with
>   rtx_insn parameter.
>
> Back-end:
> - Extend GENERAL_REG_CLASS, INDEX_REG_CLASS and their supersets with
>   corresponding regno checks for EGPRs.
> - Add GENERAL_GPR16/INDEX_GPR16 class for old 16 GPRs.
> - Whole component is controlled under -mapxf/TARGET_APX_EGPR. If it is
>   not enabled, clear r16-r31 in accessible_reg_set.
> - New register_constraint “h” and memory_constraint “Bt” that disallow
>   EGPRs in operand.
> - New asm_gpr32 flag option to enable/disable gpr32 for inline asm,
>   disabled by default.
> - If asm_gpr32 is disabled, replace constraints “r” to “h”, and
>   “m/memory” to “Bt”.
> - Extra insn attribute gpr32, value 0 indicates the alternative cannot
>   use EGPRs.
> - Add target functions for base_reg_class and index_reg_class, calls a
>   helper function to verify if insn can use EGPR in its memory_operand.
> - In the helper function, the verify process works as follows:
>     1. Returns true if APX_EGPR disabled or insn is null.
>     2. If the insn is inline asm, returns asm_gpr32 flag.
>     3. Returns false for unrecognizable insn.
>     4. Save recog_data and which_alternative, extract the insn, and restore
>        them before return.
>     5. Loop through all enabled alternatives, if one of the enabled
>        alternatives has attr_gpr32 0, returns false, otherwise returns true.
> - For insn alternatives that cannot use gpr32 in register_operand, use h
>   constraint instead of r.
> - For insn alternatives that cannot use gpr32 in memory operand, use Bt
>   constraint instead of m, and set corresponding attr_gpr32 to 0.
> - Split output template with %v if the sse version of mnemonic cannot use
>   gpr32.
> - For insn alternatives that cannot use gpr32 in memory operand, classify
>   the isa attribute and split alternatives to noavx, avx_noavx512f and etc.,
>   so the helper function can properly loop through the available enabled mask.
>
> Specifically for inline asm, we currently just map “r/m/memory” constraints as
> an example. Eventually we will support entire mapping of all common constraints
> if the mapping method was accepted.
>
> Also, for vex instructions, currently we assume egpr was supported if they have
> evex counterpart, since any APX enabled machine will have AVX10 support for all
> the evex encodings. We just disabled those mnemonics that don’t support EGPR.
> So EGPR will be allowed under -mavx2 -mapxf for many vex mnemonics.
>
> We haven’t disabled EGPR for 3DNOW/XOP/LWP/FMA4/TBM instructions, as they will
> be co-operated with -mapxf. We can disable EGPR for them if AMD guys require.

I think most of these are retired by now, so it's unlikely an implementation
providing these and also APX will appear.

I have no comments on the implementation other than having instructions
that do not support the upper GPRs is quite ugly.  I don't know of any other
target with this kind of restriction, if there is any we could see how it
deals with such situation.

Richard.

> For testing, currently we tested GCC testsuite and spec2017 with -mapxf + sde
> simulator and no more errors. Also, we inverted the register allocation order
> to force r31 to be allocated first, and no more error except those AMD only
> instructions. We will conduct further tests like changing all do-compile to
> do-assemble and add more to gcc/testsuite in the future.
>
> The RFC intends to describe our approach for APX implementation for EGPR
> component. It may still have potential issues or bugs and requires further
> optimization. Any comments are very appreciated.
>
> [1]. https://www.intel.com/content/www/us/en/developer/articles/technical/advanced-performance-extensions-apx.html.
>
> Hongyu Wang (2):
>   [APX EGPR] middle-end: Add index_reg_class with insn argument.
>   [APX EGPR] Handle GPR16 only vector move insns
>
> Kong Lingling (11):
>   [APX EGPR] middle-end: Add insn argument to base_reg_class
>   [APX_EGPR] Initial support for APX_F
>   [APX EGPR] Add 16 new integer general purpose registers
>   [APX EGPR] Add register and memory constraints that disallow EGPR
>   [APX EGPR] Map reg/mem constraints in inline asm to non-EGPR
>     constraint.
>   [APX EGPR] Add backend hook for base_reg_class/index_reg_class.
>   [APX EGPR] Handle legacy insn that only support GPR16 (1/5)
>   [APX EGPR] Handle legacy insns that only support GPR16 (2/5)
>   [APX EGPR] Handle legacy insns that only support GPR16 (3/5)
>   [APX_EGPR] Handle legacy insns that only support GPR16 (4/5)
>   [APX EGPR] Handle vex insns that only support GPR16 (5/5)
>
>  gcc/addresses.h                             |  25 +-
>  gcc/common/config/i386/cpuinfo.h            |  12 +-
>  gcc/common/config/i386/i386-common.cc       |  17 +
>  gcc/common/config/i386/i386-cpuinfo.h       |   1 +
>  gcc/common/config/i386/i386-isas.h          |   1 +
>  gcc/config/avr/avr.h                        |   5 +-
>  gcc/config/gcn/gcn.h                        |   4 +-
>  gcc/config/i386/constraints.md              |  26 +-
>  gcc/config/i386/cpuid.h                     |   1 +
>  gcc/config/i386/i386-isa.def                |   1 +
>  gcc/config/i386/i386-options.cc             |  15 +
>  gcc/config/i386/i386-opts.h                 |   8 +
>  gcc/config/i386/i386-protos.h               |   9 +
>  gcc/config/i386/i386.cc                     | 253 +++++-
>  gcc/config/i386/i386.h                      |  69 +-
>  gcc/config/i386/i386.md                     | 144 ++-
>  gcc/config/i386/i386.opt                    |  30 +
>  gcc/config/i386/mmx.md                      | 170 ++--
>  gcc/config/i386/sse.md                      | 859 ++++++++++++------
>  gcc/config/rl78/rl78.h                      |   6 +-
>  gcc/doc/invoke.texi                         |  11 +-
>  gcc/doc/tm.texi                             |  17 +-
>  gcc/doc/tm.texi.in                          |  17 +-
>  gcc/lra-constraints.cc                      |  32 +-
>  gcc/reload.cc                               |  34 +-
>  gcc/reload1.cc                              |   2 +-
>  gcc/testsuite/gcc.target/i386/apx-1.c       |   8 +
>  .../gcc.target/i386/apx-egprs-names.c       |  17 +
>  .../gcc.target/i386/apx-inline-gpr-norex2.c | 108 +++
>  .../gcc.target/i386/apx-interrupt-1.c       | 102 +++
>  .../i386/apx-legacy-insn-check-norex2-asm.c |   5 +
>  .../i386/apx-legacy-insn-check-norex2.c     | 181 ++++
>  .../gcc.target/i386/apx-spill_to_egprs-1.c  |  25 +
>  gcc/testsuite/lib/target-supports.exp       |  10 +
>  34 files changed, 1747 insertions(+), 478 deletions(-)
>  create mode 100644 gcc/testsuite/gcc.target/i386/apx-1.c
>  create mode 100644 gcc/testsuite/gcc.target/i386/apx-egprs-names.c
>  create mode 100644 gcc/testsuite/gcc.target/i386/apx-inline-gpr-norex2.c
>  create mode 100644 gcc/testsuite/gcc.target/i386/apx-interrupt-1.c
>  create mode 100644 gcc/testsuite/gcc.target/i386/apx-legacy-insn-check-norex2-asm.c
>  create mode 100644 gcc/testsuite/gcc.target/i386/apx-legacy-insn-check-norex2.c
>  create mode 100644 gcc/testsuite/gcc.target/i386/apx-spill_to_egprs-1.c
>
> --
> 2.31.1
>
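
For illustration only: the five-step helper check described in the quoted
cover letter can be modelled roughly as below.  This is a minimal sketch in
plain C under stated assumptions, not code from the patch series.  The names
toy_insn, toy_target, toy_insn_allows_egpr_in_mem and MAX_ALTS are
hypothetical stand-ins for GCC's rtx_insn / recog_data machinery, and step 4
(saving and restoring recog_data and which_alternative) has no counterpart
in this model.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-ins; NOT the real GCC data structures.  */
#define MAX_ALTS 8

enum insn_kind { INSN_NORMAL, INSN_INLINE_ASM, INSN_UNRECOGNIZED };

struct toy_insn
{
  enum insn_kind kind;
  int n_alternatives;
  bool alt_enabled[MAX_ALTS];  /* alternative enabled for the current ISA */
  int attr_gpr32[MAX_ALTS];    /* 0: this alternative cannot use EGPRs */
};

struct toy_target
{
  bool apx_egpr;   /* models TARGET_APX_EGPR */
  bool asm_gpr32;  /* models the asm_gpr32 option (off by default) */
};

/* Return true if the insn's memory operand may use r16-r31 as base/index.  */
bool
toy_insn_allows_egpr_in_mem (const struct toy_target *t,
                             const struct toy_insn *insn)
{
  /* Step 1: no restriction if EGPR is disabled or there is no insn.  */
  if (!t->apx_egpr || insn == NULL)
    return true;

  /* Step 2: inline asm follows the asm_gpr32 flag.  */
  if (insn->kind == INSN_INLINE_ASM)
    return t->asm_gpr32;

  /* Step 3: be conservative for unrecognizable insns.  */
  if (insn->kind == INSN_UNRECOGNIZED)
    return false;

  /* Step 4 (save/restore of recog state) is not modelled here.  */
  assert (insn->n_alternatives >= 0 && insn->n_alternatives <= MAX_ALTS);

  /* Step 5: a single enabled alternative with gpr32 == 0 forbids EGPRs.  */
  for (int i = 0; i < insn->n_alternatives; i++)
    if (insn->alt_enabled[i] && insn->attr_gpr32[i] == 0)
      return false;

  return true;
}
```

Note that in this model one enabled GPR16-only alternative is enough to keep
EGPRs out of the address, which is why the cover letter splits alternatives
by isa attribute: alternatives that are not enabled for the current ISA
then no longer influence the answer.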