From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mail-vk1-xa35.google.com (mail-vk1-xa35.google.com [IPv6:2607:f8b0:4864:20::a35]) by sourceware.org (Postfix) with ESMTPS id 5DDEA385742F for ; Mon, 13 Sep 2021 15:09:20 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.1 sourceware.org 5DDEA385742F Received: by mail-vk1-xa35.google.com with SMTP id 13so3531446vkl.1 for ; Mon, 13 Sep 2021 08:09:20 -0700 (PDT) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=x-gm-message-state:mime-version:references:in-reply-to:from:date :message-id:subject:to:cc; bh=up7kH1Rx41/8PNATIAbJHA/ZEluO0FV7wj2oAvmswjM=; b=7WDIe/NEDngHrR5bBIHjgtklngI1erZbpQvFoaH7InuwyudHcETx+2mKxxcfw8U5TK HiLEbg5JmQTP4iwCv9juD0whua3eVt8/uwJ36f6xt1G1lfrwHCsqk1MkUS2OPW0FzJDc StF/I7uMDQtx7WL6ghy0y4um4Ew49m0opbRTaAE8VOVw0qKZO5jHRaKOmTQLkrpYrRKr ubiM9wUJOtZuSDgHmmUuItIrBgWnI3Unu8sUQPjoqZjWEit03S5A10YXZk/yfp2QW9QJ C0Xdmg9XVGfdrwSE3spLPXD9SkveWUsTKT+ZviN8aC7GDWX3xjKt4cwmHUA75b67iODx snlw== X-Gm-Message-State: AOAM530ERmldN/hhXtRpKw+AoBvSfKnzz1FZKjXouLodRj0hA9m7obZT u9Pyum/U+64OFdMWphItKq5VK1PNxxID5ljtLSI= X-Google-Smtp-Source: ABdhPJy008YiF4P7s+dg9H8NdU2FE0HOIkKYf32MEZJko5nsKfyfdSFSa6eChwIt9XsoaSqklTBnP4bYdxvy7Zal/Ig= X-Received: by 2002:a1f:d943:: with SMTP id q64mr4832380vkg.23.1631545759904; Mon, 13 Sep 2021 08:09:19 -0700 (PDT) MIME-Version: 1.0 References: <20210910043632.1087666-1-hongtao.liu@intel.com> <26b4306a-b939-becc-3ae8-e5525442f94d@gmail.com> In-Reply-To: <26b4306a-b939-becc-3ae8-e5525442f94d@gmail.com> From: Hongtao Liu Date: Mon, 13 Sep 2021 23:09:08 +0800 Message-ID: Subject: Re: [PATCH] Relax condition of (vec_concat:M(vec_select op0 idx0)(vec_select op0 idx1)) to allow different modes between op0 and M, but have same inner mode. To: Jeff Law Cc: liuhongt , GCC Patches , rdsandiford@googlemail.com, Segher Boessenkool Content-Type: text/plain; charset="UTF-8" X-Spam-Status: No, score=-10.1 required=5.0 tests=BAYES_00, DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, DKIM_VALID_EF, FREEMAIL_FROM, GIT_PATCH_0, KAM_SHORT, RCVD_IN_DNSWL_NONE, SPF_HELO_NONE, SPF_PASS, TXREP autolearn=ham autolearn_force=no version=3.4.4 X-Spam-Checker-Version: SpamAssassin 3.4.4 (2020-01-24) on server2.sourceware.org X-BeenThere: gcc-patches@gcc.gnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Gcc-patches mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 13 Sep 2021 15:09:21 -0000 On Mon, Sep 13, 2021 at 10:10 PM Jeff Law via Gcc-patches wrote: > > > > On 9/9/2021 10:36 PM, liuhongt via Gcc-patches wrote: > > Currently for (vec_concat:M (vec_select op0 idx1)(vec_select op0 idx2)), > > optimizer wouldn't simplify if op0 has different mode with M, but that's too > > restrict which will prevent below optimization, the condition can be relaxed > > to op0 must have same inner mode with M. > > > > (set (reg:V2DF 87 [ xx ]) > > (vec_concat:V2DF (vec_select:DF (reg:V4DF 92) > > (parallel [ > > (const_int 2 [0x2]) > > ])) > > (vec_select:DF (reg:V4DF 92) > > (parallel [ > > (const_int 3 [0x3]) > > ])))) > > > > Bootsrapped and regtested on x86_64-linux-gnu{-m32,}. > > Ok for trunk? > > > > gcc/ChangeLog: > > > > * simplify-rtx.c > > (simplify_context::simplify_binary_operation_1): Relax > > condition of simplifying (vec_concat:M (vec_select op0 > > index0)(vec_select op1 index1)) to allow different modes > > between op0 and M, but have same inner mode. > > > > gcc/testsuite/ChangeLog: > > > > * gcc.target/i386/vect-rebuild.c: > > * gcc.target/i386/avx512f-vect-rebuild.c: New test. > Funny, I was looking at something rather similar recently, but never > pushed on it because we were going to need too many entries in the > parallel selector. > > I'm not convinced that we need the inner mode to match anything. As > long as the vec_concat's mode is twice the size of the vec_select modes > and the vec_select mode is <= the mode of its operands ISTM this is > fine. We might want the modes of the vec_select to match, but I don't > think that's strictly necessary either, they just need to be the same > size. ie, we could have somethig like > > (vec_concat:V2DF (vec_select:DF (reg:V4DF)) (vec_select:DI (reg:V4DI))) > > I'm not sure if that level of generality is useful though. If we want > the modes of the vec_selects to match I think we could still support > > (vec_concat:V2DF (vec_select:DF (reg:V4DF)) (vec_select:DF (reg:V8DF))) The first operand of vec_select is required to be the same here. and I guess (vec_select:DF (subreg:V4DF (reg:V8DF) 0) will be simplified to (vec_select:DF (reg:V8DF))? > > Thoughts? > > jeff > > Jeff > > > > --- > > gcc/simplify-rtx.c | 3 ++- > > .../gcc.target/i386/avx512f-vect-rebuild.c | 21 +++++++++++++++++++ > > gcc/testsuite/gcc.target/i386/vect-rebuild.c | 2 +- > > 3 files changed, 24 insertions(+), 2 deletions(-) > > create mode 100644 gcc/testsuite/gcc.target/i386/avx512f-vect-rebuild.c > > > > diff --git a/gcc/simplify-rtx.c b/gcc/simplify-rtx.c > > index ebad5cb5a79..16286befd79 100644 > > --- a/gcc/simplify-rtx.c > > +++ b/gcc/simplify-rtx.c > > @@ -4587,7 +4587,8 @@ simplify_context::simplify_binary_operation_1 (rtx_code code, > > if (GET_CODE (trueop0) == VEC_SELECT > > && GET_CODE (trueop1) == VEC_SELECT > > && rtx_equal_p (XEXP (trueop0, 0), XEXP (trueop1, 0)) > > - && GET_MODE (XEXP (trueop0, 0)) == mode) > > + && GET_MODE_INNER (GET_MODE (XEXP (trueop0, 0))) > > + == GET_MODE_INNER(mode)) > > { > > rtx par0 = XEXP (trueop0, 1); > > rtx par1 = XEXP (trueop1, 1); > > diff --git a/gcc/testsuite/gcc.target/i386/avx512f-vect-rebuild.c b/gcc/testsuite/gcc.target/i386/avx512f-vect-rebuild.c > > new file mode 100644 > > index 00000000000..aef6855aa46 > > --- /dev/null > > +++ b/gcc/testsuite/gcc.target/i386/avx512f-vect-rebuild.c > > @@ -0,0 +1,21 @@ > > +/* { dg-do compile } */ > > +/* { dg-options "-O -mavx512vl -mavx512dq -fno-tree-forwprop" } */ > > + > > +typedef double v2df __attribute__ ((__vector_size__ (16))); > > +typedef double v4df __attribute__ ((__vector_size__ (32))); > > + > > +v2df h (v4df x) > > +{ > > + v2df xx = { x[2], x[3] }; > > + return xx; > > +} > > + > > +v4df f2 (v4df x) > > +{ > > + v4df xx = { x[0], x[1], x[2], x[3] }; > > + return xx; > > +} > > + > > +/* { dg-final { scan-assembler-not "unpck" } } */ > > +/* { dg-final { scan-assembler-not "valign" } } */ > > +/* { dg-final { scan-assembler-times "\tv?extract(?:f128|f64x2)\[ \t\]" 1 } } */ > > diff --git a/gcc/testsuite/gcc.target/i386/vect-rebuild.c b/gcc/testsuite/gcc.target/i386/vect-rebuild.c > > index 570967f6b5c..8e85b98bf1d 100644 > > --- a/gcc/testsuite/gcc.target/i386/vect-rebuild.c > > +++ b/gcc/testsuite/gcc.target/i386/vect-rebuild.c > > @@ -30,4 +30,4 @@ v2df h (v4df x) > > > > /* { dg-final { scan-assembler-not "unpck" } } */ > > /* { dg-final { scan-assembler-times "\tv?permilpd\[ \t\]" 1 } } */ > > -/* { dg-final { scan-assembler-times "\tv?extractf128\[ \t\]" 1 } } */ > > +/* { dg-final { scan-assembler-times "\tv?extract(?:f128|f64x2)\[ \t\]" 1 } } */ > -- BR, Hongtao