From: Hongtao Liu
Date: Tue, 14 Nov 2023 10:40:24 +0800
Subject: Re: [RFC] Intel AVX10.1 Compiler Design and Support
To: Richard Biener
Cc: Haochen Jiang, gcc-patches@gcc.gnu.org, hongtao.liu@intel.com, ubizjak@gmail.com, Florian Weimer
References: <20231110014158.371690-1-haochen.jiang@intel.com>

On Mon, Nov
13, 2023 at 7:25 PM Richard Biener wrote:
>
> On Mon, Nov 13, 2023 at 7:58 AM Hongtao Liu wrote:
> >
> > On Fri, Nov 10, 2023 at 6:15 PM Richard Biener
> > wrote:
> > >
> > > On Fri, Nov 10, 2023 at 2:42 AM Haochen Jiang wrote:
> > > >
> > > > Hi all,
> > > >
> > > > This RFC patch aims to add AVX10.1 options. After we added -m[no-]evex512
> > > > support, it is a lot easier to add them compared to the August version.
> > > > Details on AVX10 are given below:
> > > >
> > > > Intel Advanced Vector Extensions 10 (Intel AVX10) Architecture Specification
> > > > It describes the Intel Advanced Vector Extensions 10 Instruction Set
> > > > Architecture.
> > > > https://cdrdv2.intel.com/v1/dl/getContent/784267
> > > >
> > > > The Converged Vector ISA: Intel Advanced Vector Extensions 10 Technical Paper
> > > > It provides introductory information regarding the converged vector ISA: Intel
> > > > Advanced Vector Extensions 10.
> > > > https://cdrdv2.intel.com/v1/dl/getContent/784343
> > > >
> > > > Our proposal is to take AVX10.1-256 and AVX10.1-512 as two "virtual" ISAs in
> > > > the compiler. AVX10.1-512 will imply AVX10.1-256. They will not enable
> > > > anything at first. At the end of the option handling, we will check whether
> > > > the two bits are set. If AVX10.1-256 is set, we will set the AVX512 related
> > > > ISA bits. AVX10.1-512 will further set the EVEX512 ISA bit.
> > > >
> > > > It means that AVX10 options will be separated from the existing AVX512 and the
> > > > newly added -m[no-]evex512 options. AVX10 and AVX512 options will control
> > > > (enable/disable/set vector size) the AVX512 features underneath independently.
> > > > If there's potential overlap or conflict between AVX10 and AVX512 options,
> > > > some rules are provided to define the behavior, which will be described below.
> > > >
> > > > The avx10.1 option will be provided as an alias of avx10.1-256.
> > > >
> > > > In the future, the AVX10 options will imply each other like this:
> > > >
> > > > AVX10.1-256 <---- AVX10.1-512
> > > >      ^                 ^
> > > >      |                 |
> > > >
> > > > AVX10.2-256 <---- AVX10.2-512
> > > >      ^                 ^
> > > >      |                 |
> > > >
> > > > AVX10.3-256 <---- AVX10.3-512
> > > >      ^                 ^
> > > >      |                 |
> > > >
> > > > Each of them will have its own option to enable/disable the corresponding
> > > > features. The alias avx10.x will also be provided.
> > > >
> > > > As mentioned in the August version of the RFC, since we lean towards the
> > > > adoption of AVX10 instead of AVX512 from now on, we don't recommend that
> > > > users combine the AVX10 and legacy AVX512 options.
> > >
> > > I wonder whether adoption could be made easier by also providing a
> > > -mavx10[.0] level that removes some of the more obscure sub-ISA requirements
> > > to cover more existing implementations (I'd not add -mavx10.0-512 here).
> > > I'd require only skylake-AVX512 features here, basically all non-KNL AVX512
> > > CPUs should have a "virtual" AVX10 level that allows using that feature set,
> > We have -mno-evex512, which can cover those cases, so what you want is like a
> > simple alias of "-march=skylake-avx512 -mno-evex512"?
>
> For the AVX512 enabled sub-isas of skylake-avx512 yes I guess.
>
> > > restricted to 256bits so future AVX10-256 implementations can handle it
> > > as well as all existing (and relevant, which excludes KNL) AVX512
> > > implementations.
> > >
> > > Otherwise AVX10 is really a hard sell (as AVX512 was originally).
> > It's a rebranding of the existing AVX512 to AVX10, AVX10.0 would just
> > complicate things further (considering we already have x86-64-v4 which
> > is different from skylake-avx512).
>
> Well, the cut-off for "AVX512" is quite arbitrary. Introducing a
> "new" ISA that's
> only available in HW available in the future and suggesting users to embrace
> that already (like Intel did with AVX512 without offering client SKU support)
> is a hard sell.
>
> I realize Intel thinks client SKU support for AVX10 (restricted to 256bit) will
> be "easier". But then don't expect anybody to adopt that in the next 10 years.
>
> Just to add - we were suggesting to use x86_64-v3 for the "next" enterprise
> product but got downvoted to x86_64-v2 for compatibility reasons.
>
> If it were possible I'd axe x86_64-v4. Maybe we should add an x86_64-v3.5
> that sits in between v3 and v4, offering AVX512 but restricted to 256bit
> (and obviously not requiring more of the AVX512 features that v4 requires).
The arch level is indeed a problem, especially since the default vector
size of AVX10 is 256 bits.
+Florian Weimer for more input.
>
> Richard.
>
> > >
> > > > However, we would like to introduce some
> > > > simple rules for users when it comes to combination.
> > > >
> > > > 1. Enabling AVX10 and AVX512 on the same command line with different vector
> > > > sizes will lead to a warning message. The behavior of the compiler will be
> > > > to enable AVX10 with the longer, i.e., 512-bit, vector size.
> > > >
> > > > If the vector sizes are the same (e.g. -mavx10.1-256 -mavx512f -mno-evex512,
> > > > -mavx10.1-512 -mavx512f), it will be valid with the corresponding vector size.
> > > >
> > > > 2. The -mno-avx10.1 option can't disable any features enabled by AVX512
> > > > options or impact the vector size, and vice versa. The compiler will emit
> > > > warnings if necessary.
> > > >
> > > > For the auto dispatch support including function multi-versioning, function
> > > > attribute usage, the behavior will be identical to compiler options.
> > > >
> > > > If you have any questions, feel free to ask in this thread.
> > > >
> > > > Thx,
> > > > Haochen
> > > >
> > >
> >
> > --
> > BR,
> > Hongtao

-- 
BR,
Hongtao
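
For readers following the thread, the option handling and the two
combination rules quoted above can be modelled roughly as follows. This is
only a minimal Python sketch, not GCC code: the function name resolve(), the
returned dictionary keys, and the warning text are all hypothetical. It also
assumes that -mavx512f stands in for any legacy AVX512 option, that
-mno-evex512 only affects the legacy AVX512 side, and that -mno-avx10.1
clears both virtual AVX10.1 bits; none of that is confirmed by the RFC.

```python
def resolve(flags):
    """Toy model of the proposed AVX10.1 / AVX512 option handling."""
    avx10_256 = avx10_512 = False  # the two "virtual" AVX10.1 ISA bits
    avx512 = False                 # a legacy AVX512 option was given
    no_evex512 = False             # -mno-evex512 was given

    for flag in flags:
        if flag in ("-mavx10.1", "-mavx10.1-256"):  # avx10.1 is an alias
            avx10_256 = True
        elif flag == "-mavx10.1-512":               # implies AVX10.1-256
            avx10_256 = avx10_512 = True
        elif flag == "-mno-avx10.1":                # assumption: clears both
            avx10_256 = avx10_512 = False
        elif flag == "-mavx512f":
            avx512 = True
        elif flag == "-mno-evex512":
            no_evex512 = True
        else:
            raise ValueError(f"flag not modelled in this sketch: {flag}")

    # The virtual bits are only expanded at the end of option handling.
    avx10_size = 512 if avx10_512 else 256 if avx10_256 else None
    avx512_size = (256 if no_evex512 else 512) if avx512 else None

    warnings = []
    if avx10_size and avx512_size and avx10_size != avx512_size:
        # Rule 1: sizes differ, so warn and use the longer (512-bit) size.
        warnings.append("AVX10 and AVX512 vector sizes differ; using 512 bit")
        vector_size = 512
    else:
        vector_size = avx10_size or avx512_size

    return {
        "avx512_bits": avx10_256 or avx512,  # AVX512 related ISA bits
        "evex512": vector_size == 512,       # EVEX512 ISA bit
        "warnings": warnings,
    }
```

In this model, resolve(["-mavx512f", "-mno-avx10.1"]) keeps the AVX512 bits
and the 512-bit vector size, which is how rule 2 is read here: the AVX10
negation does not touch anything enabled through AVX512 options.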