From: Hongtao Liu
Date: Mon, 21 Aug 2023 17:50:10 +0800
Subject: Re: Intel AVX10.1 Compiler Design and Support
To: Richard Biener
Cc: Jakub Jelinek, ZiNgA BuRgA, haochen.jiang@intel.com, gcc-patches@gcc.gnu.org
References: <20230808071312.1569559-1-haochen.jiang@intel.com>

On Mon, Aug 21, 2023 at 5:35 PM Richard Biener wrote:
>
> On Mon, Aug 21, 2023 at 10:28 AM Hongtao Liu wrote:
> >
> > On Mon, Aug 21, 2023 at 4:09 PM Jakub Jelinek wrote:
> > >
> > > On Mon, Aug 21, 2023 at 09:36:16AM +0200, Richard Biener via Gcc-patches wrote:
> > > > > On Sun, Aug 20, 2023 at 6:44 AM ZiNgA BuRgA via Gcc-patches
> > > > > wrote:
> > > > > >
> > > > > > Hi,
> > > > > >
> > > > > > With the proposed design of these switches, how would I restrict AVX10.1
> > > > > > to particular AVX-512 subsets?
> > > > > We can't, avx10.1 is taken as an indivisible ISA which contains all
> > > > > AVX512 related instructions.
> > > > >
> > > > > > We've been taking these cases as bugs (but yes, intrinsics are still allowed, so in some cases it might prove difficult to guarantee this).
> > > > > intel sde support avx10.1-256 target which can be used to validate the
> > > > > binary(if there's invalid 512-bit vector register or 64-bit kmask
> > > > > register is used).
> > > > >
> > > > > > I don't see any other way of doing what you want within the constraints of this design.
> > > > > It looks like the requirement is that we want a
> > > > > -mavx10-vector-width=256(or maybe reuse -mprefer-vector-width=256)
> > > > > option that acts on the original -mavx512XXX option to produce
> > > > > avx10.1-256 compatible binary. we can't use -mavx10.1-256 since it may
> > > > > include avx512fp16 directives and thus not be backward compatible
> > > > > SKX/CLX/ICX.
> > > >
> > > > Yes.  Note we cannot really re-purpose -mprefer-vector-width=256 since that
> > > > would also make uses of 512bit intrinsics ill-formed.  So we'd need a new
> > > > flag that would restrict AVX512VL to 256bit, possibly using a common internal
> > > > flag for this and the -mavx10.1-256 vector size effect.
> > > >
> > > > Maybe -mdisable-vector-width-512 or -mavx512vl-for-avx10.1-256 or
> > > > -mavx512vl-256?  Writing these the last looks most sensible to me?
> > > > Note it should combine with -mavx512vl to -mavx512vl-256 to make
> > > > -march=native -mavx512vl-256 work (I think we should also allow the
> > > > flag together with -mavx10.1*?)
> > > >
> > > > mavx512vl-256
> > > > Target ...
> > > > Disable the 512bit vector ISA subset of AVX512 or AVX10, enable
> > > > the 256bit vector ISA subset of AVX512.
> > > >
> > > Wouldn't it be better to have it similarly to other ISA options as something
> > > positive, say -mevex512 (the ISA docs talk about EVEX.512, EVEX.256 and
> > > EVEX.128)?
> > > Have -mavx512f (and anything that implies it right now) imply also -mevex512
> > > but allow -mno-evex512 which wouldn't unset everything dependent on
> > > -mavx512f.
> > > There is one gotcha, if -mavx512vl isn't enabled in the end,
> > > then -mavx512f -mno-evex512 should disable whole TARGET_AVX512F because
> > > nothing is left.
> > > TARGET_EVEX512 then would guard all TARGET_AVX512* intrinsics which operate
> > > on 512-bit vector registers or 64-bit mask registers (in addition to the
> > > other TARGET_AVX512* options, perhaps except TARGET_AVX512F), whether the
> > > 512-bit modes can be used etc.
> > We have an undocumented option mavx10-max-512bit.
> >
> > ;; Only for implementation use
> > mavx10-max-512bit
> > Target Mask(ISA2_AVX10_512BIT) Var(ix86_isa_flags2) Undocumented Save
> > Indicates 512 bit vector width support for AVX10.
>
> Ah, missed that, but ...
>
> > Currently it's only used for AVX10 only, maybe we can extend it to
> > existing AVX512*** FLAGS.
> > so users can use -mavx512XXX -mno-avx10-max-512bit to get avx10.1-256
> > compatible binaries.
>
> ... -mno-avx10-max-512bit sounds awkward, no-..-max implies the max doesn't
> apply, so what is it then?
>
> If you think -mavx512vl-256 isn't good then maybe -mavx-width-512
> and -mno-avx-width-512 would be better (applying to both avx512 and avx10).
> I chose -mavx512vl-256 because of the existing -mavx10.1-256.  Btw,
> will we then have -mavx10.2-256 as well?  Do we allow -mavx10.1-512
> -mavx10.2-256 then, thus just enable 256bit for 10.2 extensions to 10.1?!
We only allow a single vector width. -mavx10.1-512 -mavx10.2-256
will only enable -mavx10.2-256 + -mavx10.1-256.
> I think we opened up too many holes here and the options should be fixed
> to decouple the size from the base ISA.
I see, we can try to use -mavx-max-512bit (maybe under another name) to
decouple the size from the base ISA.
Then -mavx10.1-256 would just imply all -mavx512XXX + -mno-avx-max-512bit,
and -mavx10.1-512 would imply -mavx512XXX + -mavx-max-512bit.
With that, -mavx512vl-256 is simply equal to -mavx512vl + -mno-avx-max-512bit.
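To make the decomposition concrete, here is a minimal C sketch of how the width-carrying options could expand into a base ISA plus a separate width flag. It is only an illustration of the equivalences stated in this thread: `struct isa_state`, `apply_option`, and the field names are hypothetical, the option name -mavx-max-512bit is provisional, and none of this reflects GCC's actual option handling.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical model of the target state; not GCC's real data structures. */
struct isa_state {
    bool avx512_all;  /* all -mavx512XXX sub-features enabled */
    bool avx512vl;    /* -mavx512vl enabled */
    bool max_512bit;  /* provisional width flag: 512-bit vectors allowed */
};

/* Apply a single option to *s.  Returns false for options that this
   sketch does not model.  */
static bool apply_option(struct isa_state *s, const char *opt)
{
    assert(s != NULL && opt != NULL);

    if (strcmp(opt, "-mavx512vl") == 0) {
        s->avx512vl = true;
    } else if (strcmp(opt, "-mavx-max-512bit") == 0) {
        s->max_512bit = true;
    } else if (strcmp(opt, "-mno-avx-max-512bit") == 0) {
        s->max_512bit = false;
    } else if (strcmp(opt, "-mavx512vl-256") == 0) {
        /* -mavx512vl-256 == -mavx512vl + -mno-avx-max-512bit */
        s->avx512vl = true;
        s->max_512bit = false;
    } else if (strcmp(opt, "-mavx10.1-256") == 0) {
        /* all -mavx512XXX + -mno-avx-max-512bit */
        s->avx512_all = true;
        s->avx512vl = true;
        s->max_512bit = false;
    } else if (strcmp(opt, "-mavx10.1-512") == 0) {
        /* all -mavx512XXX + -mavx-max-512bit */
        s->avx512_all = true;
        s->avx512vl = true;
        s->max_512bit = true;
    } else {
        return false;
    }
    return true;
}
```

In this model, applying "-mavx512vl-256" leaves the state identical to applying "-mavx512vl" followed by "-mno-avx-max-512bit", which is the point of keeping the width in its own flag.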
Lots of work to do, but still not too late for GCC 14.1.
>
> What variable we map this to internally doesn't really matter but yes,
> we'd need to guard 512bit patterns with (AVX512VL || AVX10) && 512-enabled-flag
>
> Richard.
>
> > From the implementation perspective, we need to restrict all 512-bit
> > vector patterns/builtins/intrinsics under both AVX512XXX and
> > TARGET_AVX10_512BIT.
> > similar for register allocation, parameter passing, return value,
> > vector_mode_supported_p, gather/scatter hook, and all other hooks.
> > After that, the -mavx10-max-512bit will divide existing AVX512 into 2
> > parts, AVX512XXX-256, AVX512XXX-512.
> >
> > >
> > > Jakub
> > >
> >
> >
> > --
> > BR,
> > Hongtao



-- 
BR,
Hongtao
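
For reference, the guard condition quoted above, (AVX512VL || AVX10) && 512-enabled-flag, can be written as a small C predicate. This is a simplified sketch, not GCC code: `struct target_flags`, `allow_512bit_pattern`, and `max_vector_bits` are hypothetical names, and the model ignores every ISA other than the two named in the condition.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for the target macros discussed in the thread. */
struct target_flags {
    bool avx512vl;   /* AVX512VL enabled */
    bool avx10;      /* AVX10 enabled */
    bool width_512;  /* the "512-enabled" flag, whatever it ends up named */
};

/* The guard as written in the thread:
   (AVX512VL || AVX10) && 512-enabled-flag.  */
static bool allow_512bit_pattern(const struct target_flags *t)
{
    return (t->avx512vl || t->avx10) && t->width_512;
}

/* Widest vector width in bits under this simplified model.  Returns 0
   when neither ISA is enabled; a real target would fall back to
   narrower ISAs, which this sketch does not model.  */
static int max_vector_bits(const struct target_flags *t)
{
    if (!(t->avx512vl || t->avx10))
        return 0;
    return allow_512bit_pattern(t) ? 512 : 256;
}
```

With the width flag cleared, the same ISA flags yield the 256-bit half of the split (AVX512XXX-256), and with it set they yield the 512-bit half (AVX512XXX-512).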