From: Hongtao Liu
Date: Thu, 15 Jun 2023 15:45:11 +0800
Subject: Re: [PATCH] x86: correct and improve "*vec_dupv2di"
To: Uros Bizjak
Cc: Jan Beulich, "gcc-patches@gcc.gnu.org", Hongtao Liu, Kirill Yukhin

On Thu, Jun 15, 2023 at 3:07 PM Uros Bizjak via Gcc-patches wrote:
>
> On Thu, Jun 15, 2023 at 8:03 AM Jan Beulich via Gcc-patches
> wrote:
> >
> > The input constraint for the %vmovddup alternative was wrong, as the
> > upper 16 XMM registers require AVX512VL to be used with this insn. To
> > compensate, introduce a new alternative permitting all 32 registers, by
> > broadcasting to the full 512 bits in that case if AVX512VL is not
> > available.
> >
> > gcc/
> >
> >         * config/i386/sse.md (vec_dupv2di): Correct %vmovddup input
> >         constraint. Add new AVX512F alternative.
> > ---
> > Strictly speaking the new alternative could be enabled from AVX2
> > onwards, but vmovddup can frequently be a shorter encoding (VEX2
> > vs VEX3).
> >
> > --- a/gcc/config/i386/sse.md
> > +++ b/gcc/config/i386/sse.md
> > @@ -25851,19 +25851,39 @@
> >         (symbol_ref "true")))])
> >
> >  (define_insn "*vec_dupv2di"
> > -  [(set (match_operand:V2DI 0 "register_operand"     "=x,v,v,x")
> > +  [(set (match_operand:V2DI 0 "register_operand"     "=x,v,v,v,x")
> >         (vec_duplicate:V2DI
> > -         (match_operand:DI 1 "nonimmediate_operand" " 0,Yv,vm,0")))]
> > +         (match_operand:DI 1 "nonimmediate_operand" " 0,Yv,vm,Yvm,0")))]
> >    "TARGET_SSE"
> > -  "@
> > -   punpcklqdq\t%0, %0
> > -   vpunpcklqdq\t{%d1, %0|%0, %d1}
> > -   %vmovddup\t{%1, %0|%0, %1}
> > -   movlhps\t%0, %0"
> > -  [(set_attr "isa" "sse2_noavx,avx,sse3,noavx")
> > -   (set_attr "type" "sselog1,sselog1,sselog1,ssemov")
> > -   (set_attr "prefix" "orig,maybe_evex,maybe_vex,orig")
> > -   (set_attr "mode" "TI,TI,DF,V4SF")])
> > +{
> > +  switch (which_alternative)
> > +    {
> > +    case 0:
> > +      return "punpcklqdq\t%0, %0";
> > +    case 1:
> > +      return "vpunpcklqdq\t{%d1, %0|%0, %d1}";
> > +    case 2:
> > +      if (TARGET_AVX512VL)
> > +        return "vpbroadcastq\t{%1, %0|%0, %1}";
> > +      return "vpbroadcastq\t{%1, %g0|%g0, %1}";
>
> You can use
>
> * return TARGET_AVX512VL ? \"vpbroadcastq\t{%1, %0|%0, %1}\" :
> \"vpbroadcastq\t{%1, %g0|%g0, %1}\";
>
> directly in a multi-output insn template to avoid the above C code.
> See e.g. sse2_cvtpd2pi for an example.
>
> Uros.
>
> > +    case 3:
> > +      return "%vmovddup\t{%1, %0|%0, %1}";
> > +    case 4:
> > +      return "movlhps\t%0, %0";
> > +    default:
> > +      gcc_unreachable ();
> > +    }
> > +}
> > +  [(set_attr "isa" "sse2_noavx,avx,avx512f,sse3,noavx")
> > +   (set_attr "type" "sselog1,sselog1,ssemov,sselog1,ssemov")
> > +   (set_attr "prefix" "orig,maybe_evex,evex,maybe_vex,orig")
> > +   (set_attr "mode" "TI,TI,TI,DF,V4SF")

alternative 2 should be XImode when !TARGET_AVX512VL.

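
For instance, something along these lines might do it. This is only an untested sketch: the cond layout is an assumption for illustration and not part of the posted patch; it just replaces the fixed "mode" string with an attribute expression so that alternative 2 reports XI when the broadcast goes to the full 512 bits.

```
   ;; Sketch only (assumed layout, untested): compute "mode" per alternative.
   (set (attr "mode")
        (cond [;; Alternative 2 without AVX512VL broadcasts into %g0 (zmm).
               (and (eq_attr "alternative" "2")
                    (not (match_test "TARGET_AVX512VL")))
                 (const_string "XI")
               ;; Alternative 3 is %vmovddup.
               (eq_attr "alternative" "3")
                 (const_string "DF")
               ;; Alternative 4 is movlhps.
               (eq_attr "alternative" "4")
                 (const_string "V4SF")
              ]
              ;; Alternatives 0 and 1, and 2 with AVX512VL, stay TI.
              (const_string "TI")))
```

The other alternatives keep the modes from the posted set_attr string, so only the 512-bit case would change.
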
> > +   (set (attr "enabled")
> > +        (if_then_else
> > +          (eq_attr "alternative" "2")
> > +          (symbol_ref "TARGET_AVX512VL
> > +                       || (TARGET_AVX512F && !TARGET_PREFER_AVX256)")
> > +          (const_string "*")))])
> >
> >  (define_insn "avx2_vbroadcasti128_"
> >    [(set (match_operand:VI_256 0 "register_operand" "=x,v,v")

-- 
BR,
Hongtao