Date: Mon, 9 May 2022 16:57:36 -0600
Subject: Re: [PATCH] Strip of a vector load which is only used partially.
To: gcc-patches@gcc.gnu.org
References: <20220505050437.86261-1-hongtao.liu@intel.com>
From: Jeff Law

On 5/5/2022 2:26 AM, Richard Biener via Gcc-patches wrote:
> On Thu, May 5, 2022 at 7:04 AM liuhongt wrote:
>> Optimize
>>
>>   _1 = *srcp_3(D);
>>   _4 = VEC_PERM_EXPR <_1, _1, { 4, 5, 6, 7, 4, 5, 6, 7 }>;
>>   _5 = BIT_FIELD_REF <_4, 128, 0>;
>>
>> to
>>
>>   _1 = *srcp_3(D);
>>   _5 = BIT_FIELD_REF <_1, 128, 128>;
>>
>> the upper will finally be optimized to
>>
>>   _5 = BIT_FIELD_REF <*srcp_3(D), 128, 128>;
>>
>> Bootstrapped and regtested on x86_64-pc-linux-gnu{m32,}.
>> Ok for trunk?
> Hmm, tree-ssa-forwprop.cc:simplify_bitfield_ref should already
> handle this in the
>
>   if (code == VEC_PERM_EXPR
>       && constant_multiple_p (bit_field_offset (op), size, &idx))
>     {
>
> part of the code - maybe that needs to be enhanced to cover
> a contiguous stride in the VEC_PERM_EXPR.  I see
> we have
>
>   size = TREE_INT_CST_LOW (TYPE_SIZE (elem_type));
>   if (maybe_ne (bit_field_size (op), size))
>     return false;
>
> where it will currently bail, so adjust that to check for a
> constant multiple.  I also think we should only handle the
> case where the new bit_field_offset alignment is not
> worse than the original one.
>
> That said, I'd prefer if you integrate this transform with
> simplify_bitfield_ref.

I've got a hack here that tries to do something similar, but it's
trying to catch the case where a CONSTRUCTOR feeds the BIT_FIELD_REF.
It walks the CONSTRUCTOR elements to see if an element has the right
offset/size to satisfy the BIT_FIELD_REF.  For x264 we're often able
to eliminate the VEC_PERMUTE entirely and just forward operands into
the BIT_FIELD_REF.

I was leaning towards moving those bits into match.pd before
submitting, but if you'd prefer them in tree-ssa-forwprop, that's even
easier.

Jeff
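
For readers following the thread, the fold discussed above can be modeled outside of GCC. The sketch below is a standalone Python model, not code from tree-ssa-forwprop.cc or from either patch. The function names and parameters (fold_perm_bitfield_ref, offset_alignment, sel, nelts, elem_bits) are hypothetical. The alignment test is only one possible reading of the "not worse than the original" condition: it assumes alignment is capped at the access size, which is what lets the example from the original patch (offset 0 becoming offset 128) through.

```python
from typing import Optional, Sequence, Tuple


def offset_alignment(off: int, ref_bits: int) -> int:
    """Alignment of a bit offset, capped at the access size (an assumption)."""
    if off == 0:
        return ref_bits
    # off & -off isolates the lowest set bit of a positive integer.
    return min(off & -off, ref_bits)


def fold_perm_bitfield_ref(
    sel: Sequence[int],
    nelts: int,
    elem_bits: int,
    ref_bits: int,
    ref_off: int,
) -> Optional[Tuple[int, int]]:
    """Model of BIT_FIELD_REF <VEC_PERM_EXPR <v0, v1, sel>, ref_bits, ref_off>.

    Returns (input_index, new_bit_offset) if the reference can read
    directly from one permute input, otherwise None.
    """
    # Size and offset must both be constant multiples of the element size.
    if ref_bits <= 0 or ref_bits % elem_bits or ref_off % elem_bits:
        return None
    count = ref_bits // elem_bits
    first = ref_off // elem_bits
    if first + count > len(sel):
        return None

    # The selected lanes must form a contiguous ascending run.
    lanes = sel[first:first + count]
    if any(lanes[i] != lanes[0] + i for i in range(count)):
        return None

    # The run must stay inside a single input vector (v0 or v1).
    src, start = divmod(lanes[0], nelts)
    if src > 1 or start + count > nelts:
        return None

    new_off = start * elem_bits
    # Hypothetical reading of the alignment condition (see lead-in).
    if offset_alignment(new_off, ref_bits) < offset_alignment(ref_off, ref_bits):
        return None
    return src, new_off
```

With the selector from the original patch, fold_perm_bitfield_ref([4, 5, 6, 7, 4, 5, 6, 7], 8, 32, 128, 0) yields (0, 128), matching BIT_FIELD_REF <_1, 128, 128>. A selector such as { 0, 2, 4, 6, ... } is rejected because the lanes are not contiguous.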