From mboxrd@z Thu Jan 1 00:00:00 1970
MIME-Version: 1.0
References: <20210624121213.3469943-1-hjl.tools@gmail.com>
From: "H.J. Lu"
Date: Fri, 25 Jun 2021 05:39:36 -0700
Subject: [PATCH v2] x86: Check AVX512 without mask instructions
To: Uros Bizjak
Cc: Hongtao Liu, Hongtao Liu, "gcc-patches@gcc.gnu.org"
Content-Type: multipart/mixed; boundary="000000000000d8922a05c59670a8"
List-Id: Gcc-patches mailing list

--000000000000d8922a05c59670a8
Content-Type: text/plain; charset="UTF-8"

On Fri, Jun 25, 2021 at 12:50 AM Uros Bizjak wrote:
>
> On Fri, Jun 25, 2021 at 4:51 AM Hongtao Liu wrote:
> >
> > On Fri, Jun 25, 2021 at 12:13 AM Uros Bizjak via Gcc-patches wrote:
> > >
> > > On Thu, Jun 24, 2021 at 2:12 PM H.J. Lu wrote:
> > > >
> > > > CPUID functions are used to detect CPU features. If vector ISAs
> > > > are enabled, compiler is free to use them in these functions. Add
> > > > __attribute__ ((target("general-regs-only"))) to CPUID functions
> > > > to avoid vector instructions.
> > >
> > > These functions are intended to be inlined, so how does target
> > > attribute affect inlining?
> > I guess w/ -O0. they may not be inlined, that's why H.J adds those
> > attributes to those functions.
>
> The problem is not with these functions, but with surrounding checks
> for cpuid features. These checks are implemented with logic
> instructions, and nothing prevents RA from allocating mask registers,
> and consequently mask insn is emitted.
> Regarding mentioned functions,
> cpuid insn pattern has four GPR single-reg constraints, so mask
> registers can't be allocated here.
>
> > pr96814.dump:
> > 0804aa40 <main>:
> >  804aa40: 8d 4c 24 04          lea    0x4(%esp),%ecx
> > ...
> >  804aa63: 6a 07                push   $0x7
> >  804aa65: e8 e0 e7 ff ff       call   804924a <__get_cpuid_count>
> >
> > Also we need to add a target attribute to avx512f_os_support (), and
> > that would be enough to fix the AVX512 part.
> >
> > Moreover, all check functions in below files may also need to deal with:
> > adx-check.h
> > aes-avx-check.h
> > aes-check.h
> > amx-check.h
> > attr-nocf-check-1a.c
> > attr-nocf-check-3a.c
> > avx2-check.h
> > avx2-vpop-check.h
> > avx512bw-check.h
> > avx512-check.h
> > avx512dq-check.h
> > avx512er-check.h
> > avx512f-check.h
> > avx512vl-check.h
> > avx-check.h
> > bmi2-check.h
> > bmi-check.h
> > cf_check-1.c
> > cf_check-2.c
> > cf_check-3.c
> > cf_check-4.c
> > cf_check-5.c
> > f16c-check.h
> > fma4-check.h
> > fma-check.h
> > isa-check.h
> > lzcnt-check.h
> > m128-check.h
> > m256-check.h
> > m512-check.h
> > mmx-3dnow-check.h
> > mmx-check.h
> > pclmul-avx-check.h
> > pclmul-check.h
> > pr39315-check.c
> > rtm-check.h
> > sha-check.h
> > spellcheck-options-1.c
> > spellcheck-options-2.c
> > spellcheck-options-3.c
> > spellcheck-options-4.c
> > spellcheck-options-5.c
> > sse2-check.h
> > sse3-check.h
> > sse4_1-check.h
> > sse4_2-check.h
> > sse4a-check.h
> > sse-check.h
> > ssse3-check.h
> > stack-check-11.c
> > stack-check-12.c
> > stack-check-17.c
> > stack-check-18.c
> > stack-check-19.c
> > xop-check.h
>
> True, but this would just paper over the real problem. Now, it is
> expected that the user decorates the function that checks CPUID
> features with the target attribute. I'm not sure if this is OK.
>
> Uros.

CPUID functions are used to detect CPU features. If mask instructions
are enabled, compiler is free to use them in these functions. Disable
AVX512F in AVX512 check with target pragma to avoid mask instructions.

OK for master?

Thanks.

-- 
H.J.

--000000000000d8922a05c59670a8
Content-Type: text/x-patch; charset="US-ASCII"; name="v2-0001-x86-Check-AVX512-without-mask-instructions.patch"
Content-Disposition: attachment; filename="v2-0001-x86-Check-AVX512-without-mask-instructions.patch"
X-Attachment-Id: f_kqcbm4hr0

From 7961c445f27de3e813d7332afc14e9844c0831f7 Mon Sep 17 00:00:00 2001
From: "H.J. Lu" <hjl.tools@gmail.com>
Date: Thu, 24 Jun 2021 04:43:41 -0700
Subject: [PATCH v2] x86: Check AVX512 without mask instructions

CPUID functions are used to detect CPU features.  If mask instructions
are enabled, compiler is free to use them in these functions.  Disable
AVX512F in AVX512 check with target pragma to avoid mask instructions.

	PR target/101185
	* gcc.target/i386/avx512-check.h: Disable AVX512F in AVX512 check
	with target pragma.
---
 gcc/testsuite/gcc.target/i386/avx512-check.h | 7 +++++++
 1 file changed, 7 insertions(+)

diff --git a/gcc/testsuite/gcc.target/i386/avx512-check.h b/gcc/testsuite/gcc.target/i386/avx512-check.h
index 0a377dba1d5..7ccf730c4f1 100644
--- a/gcc/testsuite/gcc.target/i386/avx512-check.h
+++ b/gcc/testsuite/gcc.target/i386/avx512-check.h
@@ -1,5 +1,8 @@
 #include <stdlib.h>
+#pragma GCC push_options
+#pragma GCC target ("no-mmx,no-sse")
 #include "cpuid.h"
+#pragma GCC pop_options
 #include "m512-check.h"
 #include "avx512f-os-support.h"
 
@@ -25,6 +28,9 @@ do_test (void)
 }
 #endif
 
+#pragma GCC push_options
+#pragma GCC target ("no-mmx,no-sse")
+
 static int
 check_osxsave (void)
 {
@@ -110,3 +116,4 @@ main ()
 #endif
   return 0;
 }
+#pragma GCC pop_options
-- 
2.31.1

--000000000000d8922a05c59670a8--
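
The approach described in the message text (compiling the CPUID feature check under a target pragma so that AVX512 mask instructions cannot be emitted there) might look roughly like the sketch below. It is an illustration under stated assumptions, not the code from the patch: the helper name cpu_has_avx512f is hypothetical, the option string "no-avx512f" is taken from the wording "Disable AVX512F" and may differ from what the patch actually passes to the pragma, and the OS-support check mentioned in the thread (avx512f_os_support ()) is left out. On non-x86 targets the helper simply returns 0.

```c
#if defined(__GNUC__) && (defined(__i386__) || defined(__x86_64__))
#include <cpuid.h>

/* Compile the feature check with AVX512F disabled, so the register
   allocator cannot choose mask registers for the bit tests below.
   The option string is an assumption made for this sketch.  */
#pragma GCC push_options
#pragma GCC target ("no-avx512f")

/* Hypothetical helper: returns 1 if CPUID reports AVX512F, else 0.
   It does not check whether the OS saves the AVX512 state.  */
static int
cpu_has_avx512f (void)
{
  unsigned int eax, ebx, ecx, edx;

  /* Query leaf 7, subleaf 0.  __get_cpuid_count returns 0 when the
     leaf is not supported by the processor.  */
  if (!__get_cpuid_count (7, 0, &eax, &ebx, &ecx, &edx))
    return 0;

  /* CPUID.(EAX=7,ECX=0):EBX bit 16 is the AVX512F feature flag.  */
  return (ebx & (1u << 16)) != 0;
}

/* Restore the command-line target options for the rest of the file.  */
#pragma GCC pop_options

#else

/* Fallback for targets without CPUID: report the feature as absent.  */
static int
cpu_has_avx512f (void)
{
  return 0;
}

#endif
```

As the thread points out, the concern is the logic around the cpuid instruction rather than the cpuid wrapper itself, so in this sketch the pragma region covers the whole helper, including the bit test on ebx.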