From mboxrd@z Thu Jan  1 00:00:00 1970
MIME-Version: 1.0
References: <20230808071312.1569559-1-haochen.jiang@intel.com>
 <OS3P286MB058470EBCDAA49D4F533745BD718A@OS3P286MB0584.JPNP286.PROD.OUTLOOK.COM>
 <CAMZc-bxJTxuPL0yF8JjQf_g1LyAmseq_Rt3Xw_-LvR1GvC-nPw@mail.gmail.com>
 <CAFiYyc13Z_rL7ENTVT82jW_7oU6CN0RoY+fLmgApG4LWKNTOQg@mail.gmail.com>
 <ZOMbnj/k9+EEKlU/@tucnak> <CAMZc-bwFJCYvYtBw0n632bbao_WnbqO3md+Zx38YLTgO56WjTw@mail.gmail.com>
 <CAFiYyc0mqgc4MMN5-dzAxRvqgx45=P-=HdH2_xZVtX_qz2GDKA@mail.gmail.com>
In-Reply-To: <CAFiYyc0mqgc4MMN5-dzAxRvqgx45=P-=HdH2_xZVtX_qz2GDKA@mail.gmail.com>
From: Hongtao Liu <crazylht@gmail.com>
Date: Mon, 21 Aug 2023 17:50:10 +0800
Message-ID: <CAMZc-bwWUnDqKkJGNyvJb2p4Mq4+=XHdq7vj-34Hfziv1G0z_g@mail.gmail.com>
Subject: Re: Intel AVX10.1 Compiler Design and Support
To: Richard Biener <richard.guenther@gmail.com>
Cc: Jakub Jelinek <jakub@redhat.com>, ZiNgA BuRgA <zingaburga@hotmail.com>, haochen.jiang@intel.com, 
	gcc-patches@gcc.gnu.org
Content-Type: text/plain; charset="UTF-8"
Content-Transfer-Encoding: quoted-printable
List-Id: <gcc-patches.gcc.gnu.org>

On Mon, Aug 21, 2023 at 5:35 PM Richard Biener
<richard.guenther@gmail.com> wrote:
>
> On Mon, Aug 21, 2023 at 10:28 AM Hongtao Liu <crazylht@gmail.com> wrote:
> >
> > On Mon, Aug 21, 2023 at 4:09 PM Jakub Jelinek <jakub@redhat.com> wrote:
> > >
> > > On Mon, Aug 21, 2023 at 09:36:16AM +0200, Richard Biener via Gcc-patches wrote:
> > > > > On Sun, Aug 20, 2023 at 6:44 AM ZiNgA BuRgA via Gcc-patches
> > > > > <gcc-patches@gcc.gnu.org> wrote:
> > > > > >
> > > > > > Hi,
> > > > > >
> > > > > > With the proposed design of these switches, how would I restrict AVX10.1
> > > > > > to particular AVX-512 subsets?
> > > > > We can't, avx10.1 is taken as an indivisible ISA which contains all
> > > > > AVX512 related instructions.
> > > > >
> > > > > > We’ve been taking these cases as bugs (but yes, intrinsics are still allowed, so in some cases it might prove difficult to guarantee this).
> > > > > intel sde support avx10.1-256 target which can be used to validate the
> > > > > binary(if there's invalid 512-bit vector register or 64-bit kmask
> > > > > register is used).
> > > > > > I don’t see any other way of doing what you want within the constraints of this design.
> > > > > It looks like the requirement is that we want a
> > > > > -mavx10-vector-width=256(or maybe reuse -mprefer-vector-width=256)
> > > > > option that acts on the original -mavx512XXX option to produce
> > > > > avx10.1-256 compatible binary. we can't use -mavx10.1-256 since it may
> > > > > include avx512fp16 directives and thus not be backward compatible
> > > > > SKX/CLX/ICX.
> > > >
> > > > Yes.  Note we cannot really re-purpose -mprefer-vector-width=256 since that
> > > > would also make uses of 512bit intrinsics ill-formed.  So we'd need a new
> > > > flag that would restrict AVX512VL to 256bit, possibly using a common internal
> > > > flag for this and the -mavx10.1-256 vector size effect.
> > > >
> > > > Maybe -mdisable-vector-width-512 or -mavx512vl-for-avx10.1-256 or
> > > > -mavx512vl-256?  Writing these the last looks most sensible to me?
> > > > Note it should combine with -mavx512vl to -mavx512vl-256 to make
> > > > -march=native -mavx512vl-256 work (I think we should also allow the
> > > > flag together with -mavx10.1*?)
> > > >
> > > > mavx512vl-256
> > > > Target ...
> > > > Disable the 512bit vector ISA subset of AVX512 or AVX10, enable
> > > > the 256bit vector ISA subset of AVX512.
> > >
> > > Wouldn't it be better to have it similarly to other ISA options as something
> > > positive, say -mevex512 (the ISA docs talk about EVEX.512, EVEX.256 and
> > > EVEX.128)?
> > > Have -mavx512f (and anything that implies it right now) imply also -mevex512
> > > but allow -mno-evex512 which wouldn't unset everything dependent on
> > > -mavx512f.  There is one gotcha, if -mavx512vl isn't enabled in the end,
> > > then -mavx512f -mno-evex512 should disable whole TARGET_AVX512F because
> > > nothing is left.
> > > TARGET_EVEX512 then would guard all TARGET_AVX512* intrinsics which operate
> > > on 512-bit vector registers or 64-bit mask registers (in addition to the
> > > other TARGET_AVX512* options, perhaps except TARGET_AVX512F), whether the
> > > 512-bit modes can be used etc.
> > We have an undocumented option mavx10-max-512bit.
> >
> > ;; Only for implementation use
> > mavx10-max-512bit
> > Target Mask(ISA2_AVX10_512BIT) Var(ix86_isa_flags2) Undocumented Save
> > Indicates 512 bit vector width support for AVX10.
>
> Ah, missed that, but ...
>
> > Currently it's only used for AVX10 only, maybe we can extend it to
> > existing AVX512*** FLAGS.
> > so users can use -mavx512XXX -mno-avx10-max-512bit to get avx10.1-256
> > compatible binaries.
>
> ... -mno-avx10-max-512bit sounds awkward, no-..-max implies the max doesn't
> apply, so what is it then?
>
> If you think -mavx512vl-256 isn't good then maybe -mavx-width-512
> and -mno-avx-width-512 would be better (applying to both avx512 and avx10).
> I chose -mavx512vl-256 because of the existing -mavx10.1-256.  Btw,
> will we then have -mavx10.2-256 as well?  Do we allow -mavx10.1-512
> -mavx10.2-256 then, thus just enable 256bit for 10.2 extensions to 10.1?!
We're only allowing a single vector width.
-mavx10.1-512 -mavx10.2-256 will only enable -mavx10.2-256 + -mavx10.1-256.
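
To illustrate the single-width rule, here is a minimal sketch in C. It is
not GCC's option handling: struct avx10_state and apply_avx10_option are
hypothetical names, and the rule that the last width on the command line
wins is an assumption that merely matches the example above.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical model, NOT GCC's real option state: the highest AVX10
   version requested so far plus one vector width shared by all of them.  */
struct avx10_state
{
  int minor;  /* 1 for avx10.1, 2 for avx10.2, 0 if none seen yet */
  int width;  /* 256 or 512, 0 if none seen yet */
};

/* Parse "-mavx10.<minor>-<width>" and fold it into ST.
   Returns 1 if the option was recognized, 0 otherwise.  */
static int
apply_avx10_option (struct avx10_state *st, const char *opt)
{
  int minor, width;
  char tail;

  assert (st != NULL && opt != NULL);
  /* Exactly two conversions must succeed; a third means trailing junk.  */
  if (sscanf (opt, "-mavx10.%d-%d%c", &minor, &width, &tail) != 2)
    return 0;
  if (minor < 1 || (width != 256 && width != 512))
    return 0;

  /* Versions accumulate: asking for 10.2 keeps 10.1 enabled.  */
  if (minor > st->minor)
    st->minor = minor;
  /* ASSUMPTION: the width given last replaces any earlier width.  */
  st->width = width;
  return 1;
}
```

Under that assumption, feeding -mavx10.1-512 and then -mavx10.2-256 leaves
minor == 2 and width == 256, i.e. -mavx10.2-256 + -mavx10.1-256.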
> I think we opened up too many holes here and the options should be fixed
> to decouple the size from the base ISA.
I see, we can try to use -mavx-max-512bit (maybe under another name) to
decouple the size from the base ISA.
Then make
 -mavx10.1-256 imply all -mavx512XXX + -mno-avx-max-512bit, and
 -mavx10.1-512 imply -mavx512XXX + -mavx-max-512bit.
Then -mavx512vl-256 is just equal to -mavx512vl + -mno-avx-max-512bit.
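
The implications above could be sketched roughly as follows. This is only
an illustration in C: the ISA_* bits and apply_option are hypothetical, not
GCC's real masks or option code, and ISA_AVX512_ALL stands in for the full
-mavx512XXX set with just three sample bits.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical flag bits for illustration; these are NOT GCC's masks.  */
#define ISA_AVX512F     (1u << 0)
#define ISA_AVX512VL    (1u << 1)
#define ISA_AVX512FP16  (1u << 2)
/* Stand-in for "all -mavx512XXX"; the real set is much larger.  */
#define ISA_AVX512_ALL  (ISA_AVX512F | ISA_AVX512VL | ISA_AVX512FP16)
/* The proposed size switch, decoupled from the base ISA.  */
#define ISA_MAX_512BIT  (1u << 3)

/* Fold one command-line option into FLAGS and return the new flags.
   Unknown options leave FLAGS unchanged.  */
static unsigned
apply_option (unsigned flags, const char *opt)
{
  assert (opt != NULL);
  if (strcmp (opt, "-mavx-max-512bit") == 0)
    return flags | ISA_MAX_512BIT;
  if (strcmp (opt, "-mno-avx-max-512bit") == 0)
    return flags & ~ISA_MAX_512BIT;
  if (strcmp (opt, "-mavx512vl") == 0)
    return flags | ISA_AVX512VL;
  /* -mavx512vl-256 == -mavx512vl + -mno-avx-max-512bit.  */
  if (strcmp (opt, "-mavx512vl-256") == 0)
    return (flags | ISA_AVX512VL) & ~ISA_MAX_512BIT;
  /* -mavx10.1-256 == all -mavx512XXX + -mno-avx-max-512bit.  */
  if (strcmp (opt, "-mavx10.1-256") == 0)
    return (flags | ISA_AVX512_ALL) & ~ISA_MAX_512BIT;
  /* -mavx10.1-512 == all -mavx512XXX + -mavx-max-512bit.  */
  if (strcmp (opt, "-mavx10.1-512") == 0)
    return flags | ISA_AVX512_ALL | ISA_MAX_512BIT;
  return flags;
}
```

With this shape the width bit is the only thing that differs between
-mavx10.1-256 and -mavx10.1-512, which is the decoupling being asked for.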

Lots of work to do, but it is still not too late for GCC 14.1.
>
> What variable we map this to internally doesn't really matter but yes,
> we'd need to guard 512bit patterns with (AVX512VL || AVX10) && 512-enabled-flag
>
> Richard.
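
For reference, the guard described above written out as a predicate. This
is a small sketch in C only: struct target_flags and its fields are
hypothetical stand-ins for the real TARGET_* macros, and the condition is
copied from the formula above rather than from any existing GCC code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the relevant target flags.  */
struct target_flags
{
  bool avx512vl;    /* -mavx512vl is enabled */
  bool avx10;       /* some -mavx10.N is enabled */
  bool enable_512;  /* the 512-enabled flag, whatever it ends up named */
};

/* True if 512-bit patterns may be used:
   (AVX512VL || AVX10) && 512-enabled-flag.  */
static bool
can_use_512bit_patterns (const struct target_flags *t)
{
  assert (t != NULL);
  return (t->avx512vl || t->avx10) && t->enable_512;
}
```

Every 512-bit pattern, builtin and hook would then test this one condition
instead of its current AVX512XXX check alone.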
>
> > From the implementation perspective, we need to restrict all 512-bit
> > vector patterns/builtins/intrinsics under both AVX512XXX and
> > TARGET_AVX10_512BIT.
> > similar for register allocation, parameter passing, return value,
> > vector_mode_supported_p, gather/scatter hook, and all other hooks.
> > After that, the -mavx10-max-512bit will divide existing AVX512 into 2
> > parts, AVX512XXX-256, AVX512XXX-512.
> >
> >
> > >
> > >         Jakub
> > >
> >
> >
> > --
> > BR,
> > Hongtao


-- 
BR,
Hongtao