From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from smtp-out2.suse.de (smtp-out2.suse.de [IPv6:2001:67c:2178:6::1d]) by sourceware.org (Postfix) with ESMTPS id 626713858438 for ; Thu, 22 Sep 2022 09:29:42 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.1 sourceware.org 626713858438 Authentication-Results: sourceware.org; dmarc=pass (p=none dis=none) header.from=suse.de Authentication-Results: sourceware.org; spf=pass smtp.mailfrom=suse.de Received: from relay2.suse.de (relay2.suse.de [149.44.160.134]) by smtp-out2.suse.de (Postfix) with ESMTP id 222CD1F88F; Thu, 22 Sep 2022 09:29:41 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=suse.de; s=susede2_rsa; t=1663838981; h=from:from:reply-to:date:date:message-id:message-id:to:to:cc:cc: mime-version:mime-version:content-type:content-type: in-reply-to:in-reply-to:references:references; bh=0qfYrWueuFX0vFTNIvOElhpXoCx+z1n2LXqPLiTl+Ik=; b=Mo8SEZZcv26+Rax1cdS0bmeh8PiYkC+HHPgWzsAFJNyhLTIY4njii5sYBg8UTAueDl1uoG TaqxMeaALa6mjBAKWBHk8R/kfI/GSXtDKqGjseDVgYsNlH3XD2/gHCVrDoq+rbVWRuSZDI xmXGOEB56CpeE+5QDks8oGEMRfoBP4Q= DKIM-Signature: v=1; a=ed25519-sha256; c=relaxed/relaxed; d=suse.de; s=susede2_ed25519; t=1663838981; h=from:from:reply-to:date:date:message-id:message-id:to:to:cc:cc: mime-version:mime-version:content-type:content-type: in-reply-to:in-reply-to:references:references; bh=0qfYrWueuFX0vFTNIvOElhpXoCx+z1n2LXqPLiTl+Ik=; b=pU14QcV5Wy9VVrcVFdqkkA+o66H8qHZ84gcsMLKIOwPXq7XeWQK9C48aLVEA9pAbkIdRtK dWGxRj8Iz56EtpAQ== Received: from wotan.suse.de (wotan.suse.de [10.160.0.1]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by relay2.suse.de (Postfix) with ESMTPS id 03C392C141; Thu, 22 Sep 2022 09:29:40 +0000 (UTC) Date: Thu, 22 Sep 2022 09:29:40 +0000 (UTC) From: Richard Biener To: "juzhe.zhong@rivai.ai" cc: gcc-patches Subject: Re: Re: [PATCH] DSE: Enhance dse with def-ref analysis In-Reply-To: <557C473EC7DDBF2B+2022092217123939892325@rivai.ai> Message-ID: References: <20220922070625.7197-1-juzhe.zhong@rivai.ai>, , , <2794D55ECC21EB61+2022092215574700429612@rivai.ai>, , , <557C473EC7DDBF2B+2022092217123939892325@rivai.ai> User-Agent: Alpine 2.22 (LSU 394 2020-01-19) MIME-Version: 1.0 Content-Type: text/plain; charset=US-ASCII X-Spam-Status: No, score=-11.1 required=5.0 tests=BAYES_00,DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF,GIT_PATCH_0,KAM_SHORT,RCVD_IN_DNSWL_NONE,SPF_HELO_NONE,SPF_PASS,TXREP autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org List-Id: On Thu, 22 Sep 2022, juzhe.zhong@rivai.ai wrote: > Hi, Richard. I tried your suggestion which is applying your code and PR106019. > It works for me now. Thank you so much. > > I will apply your suggestion on RVV GCC12.2 downstream (Because it has not been supported on upstream). > > I have another question: > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99409 > It seems that this issue occurs because GCC miss scalar-expansion optimization. > > I read the book ?Compiler Challenges for High-Performance Architectures?. > There is a chapter: Chapter 5.3 Scalar Expansion. > Is it a good idea to implement a new pass in GCC following the scalar expansion algorithm this book provided? > Or you have another better option to fix this issue ? Thanks. Since this is about vectorization the more canonical place to perform this is during if-conversion where we create a loop copy with transforms applied that help vectorization. It would be also nice to have a look at LLVM to see how they tackle this specific case (they seem to manage with registers and shuffling) Richard. > > juzhe.zhong@rivai.ai > > From: Richard Biener > Date: 2022-09-22 16:48 > To: juzhe.zhong@rivai.ai > CC: gcc-patches > Subject: Re: Re: [PATCH] DSE: Enhance dse with def-ref analysis > On Thu, 22 Sep 2022, juzhe.zhong@rivai.ai wrote: > > > Does your local code exclude my codes? > > I am using GCC12.2. When I delete all my codes and apply your codes only. > > It fails to delete redundant stores and no auto-vecotorization of my RVV GCC in this test. > > I am not sure whether I am on the same page with you. > > I applied my patch to GCC master where it handles the testcase > from the PR in the first 042t.dse1 pass. I have not applied your > patch. The patch needs an amendment to pass bootstrap, > > if (is_gimple_assign (use_stmt)) > > needs to be > > if (ref->ref && is_gimple_assign (use_stmt)) > > testing then also reveals > > XPASS: gcc.dg/vect/tsvc/vect-tsvc-s243.c -flto -ffat-lto-objects > scan-tree-dump vect "vectorized 1 loops" > XPASS: gcc.dg/vect/tsvc/vect-tsvc-s243.c scan-tree-dump vect "vectorized 1 > loops" > > I guess that's expected. Indeed when applying the patch to the > GCC 12 branch the case isn't optimized. I think it's probably > the PR106019 fix missing, aka r13-1203-g038b077689bb53 > > Richard. > > > > > > > > > juzhe.zhong@rivai.ai > > > > From: Richard Biener > > Date: 2022-09-22 16:01 > > To: juzhe.zhong@rivai.ai > > Subject: Re: Re: [PATCH] DSE: Enhance dse with def-ref analysis > > On Thu, 22 Sep 2022, juzhe.zhong@rivai.ai wrote: > > > > > I tried this solution you gave: > > > >> else if (ref_maybe_used_by_stmt_p (use_stmt, ref)) > > > >> { > > > >> if (is_gimple_assign (use_stmt)) > > > >> { > > > >> data_reference_p dra, drb; > > > >> dra = create_data_ref (NULL, NULL, ref->ref, stmt, > > > >> false, false); > > > >> drb = create_data_ref (NULL, NULL, > > > >> gimple_assign_rhs1 (use_stmt), > > > >> use_stmt, false, false); > > > >> bool alias_p = dr_may_alias_p (dra, drb, NULL); > > > >> free_data_ref (dra); > > > >> free_data_ref (drb); > > > >> if (!alias_p) > > > >> { > > > >> if (gimple_vdef (use_stmt)) > > > >> defs.safe_push (use_stmt); > > > >> continue; > > > >> } > > > >> } > > > > > > It still fails to delete the redundant store. The reason is when checking the redundant store. > > > it didn't match the condtion: ref_maybe_used_by_stmt_p (use_stmt, ref). > > > > It does for me: > > > > Deleted dead store: a[i_18] = _5; > > > > ... > > > > : > > _1 = b[i_18]; > > _2 = c[i_18]; > > _3 = d[i_18]; > > _4 = _2 * _3; > > _5 = _1 + _4; > > _8 = e[i_18]; > > _9 = _3 * _8; > > _10 = _5 + _9; > > b[i_18] = _10; > > _12 = i_18 + 1; > > _13 = a[_12]; > > _15 = _3 * _13; > > _16 = _10 + _15; > > a[i_18] = _16; > > > > the other relevant function is stmt_kills_ref_p, that one does > > handle a[i_18] vs. a[i_18] just fine. > > > > > Maybe we should first figure why it doesn't satisfy this situation? > > > > > > > > > juzhe.zhong@rivai.ai > > > > > > From: Richard Biener > > > Date: 2022-09-22 15:44 > > > To: Ju-Zhe Zhong > > > CC: gcc-patches; richard.sandiford > > > Subject: Re: [PATCH] DSE: Enhance dse with def-ref analysis > > > On Thu, 22 Sep 2022, Richard Biener wrote: > > > > > > > On Thu, 22 Sep 2022, juzhe.zhong@rivai.ai wrote: > > > > > > > > > From: Ju-Zhe Zhong > > > > > > > > > > This patch fix issue: PR 99407 > > > > > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99407 > > > > > > > > > > The enhancement implementation is simple: > > > > > 1.Search gimple statement in program reverse order. > > > > > 2.Queue the store statement which may be possible kill the def > > > > > of previous store statement. > > > > > 3.Perform dse_def_ref_analysis to remove stores will not kill > > > > > any def. > > > > > For example: > > > > > a[i_18] = _5; > > > > > ... > > > > > foo (&a); > > > > > a[i_18] = _7; > > > > > > > > > > a[i_18] = _7 is queued at the begining and will be removed > > > > > in dse_def_ref_analysis. > > > > > 4.Remove the store if the def is confirmed to be killed. > > > > > > > > But we already do the very same thing in dse_classify_store, I fail > > > > to see why we need to have an alternate implementation? It also > > > > seems to be quadratic in the size of a basic-block? > > > > > > > > The issue with dse_classify_store is that it relies on > > > > ref_maybe_used_by_stmt_p but that doesn't handle > > > > > > > > a[i] = ..; > > > > .. = a[i+1]; > > > > > > > > but when seeing a[_1] vs. a[_2] (two variable offsets), it gives > > > > up, asserting may-aliasing. We do have infrastructure to catch > > > > such cases with data reference analysis. If we want to catch > > > > these cases we should use that instead. Given we have a > > > > DSE/DCE pass pair right before loop optimizations we could even > > > > move those inside of the loop pipeline and perform this more > > > > expensive checks conditional on loop/scev availability. > > > > > > Oh, and when doing non-loop aware analysis we don't need SCEV. The > > > following optimizes the testcase but as said I don't think we want > > > to perform this for each of the DSE passes since it can be somewhat > > > expensive, at least without doing more caching (we could keep a > > > stmt -> data-ref hash-map and compute data-refs at most once for each > > > statement, that would make it more acceptable). > > > > > > Richard. > > > > > > From 515b213e9d06c2bd36160e66728f57e48095bb84 Mon Sep 17 00:00:00 2001 > > > From: Richard Biener > > > Date: Thu, 22 Sep 2022 09:40:40 +0200 > > > Subject: [PATCH] tree-optimization/99407 - DSE with data-ref analysis > > > To: gcc-patches@gcc.gnu.org > > > > > > * tree-ssa-dse.c (dse_classify_store): Use data-ref analysis > > > to disambiguate more uses. > > > --- > > > gcc/tree-ssa-dse.cc | 21 +++++++++++++++++++++ > > > 1 file changed, 21 insertions(+) > > > > > > diff --git a/gcc/tree-ssa-dse.cc b/gcc/tree-ssa-dse.cc > > > index 34cfd1a8802..340a54f4105 100644 > > > --- a/gcc/tree-ssa-dse.cc > > > +++ b/gcc/tree-ssa-dse.cc > > > @@ -45,6 +45,8 @@ along with GCC; see the file COPYING3. If not see > > > #include "ipa-modref.h" > > > #include "target.h" > > > #include "tree-ssa-loop-niter.h" > > > +#include "cfgloop.h" > > > +#include "tree-data-ref.h" > > > /* This file implements dead store elimination. > > > @@ -1019,6 +1021,25 @@ dse_classify_store (ao_ref *ref, gimple *stmt, > > > /* If the statement is a use the store is not dead. */ > > > else if (ref_maybe_used_by_stmt_p (use_stmt, ref)) > > > { > > > + if (is_gimple_assign (use_stmt)) > > > + { > > > + data_reference_p dra, drb; > > > + dra = create_data_ref (NULL, NULL, ref->ref, stmt, > > > + false, false); > > > + drb = create_data_ref (NULL, NULL, > > > + gimple_assign_rhs1 (use_stmt), > > > + use_stmt, false, false); > > > + bool alias_p = dr_may_alias_p (dra, drb, NULL); > > > + free_data_ref (dra); > > > + free_data_ref (drb); > > > + if (!alias_p) > > > + { > > > + if (gimple_vdef (use_stmt)) > > > + defs.safe_push (use_stmt); > > > + continue; > > > + } > > > + } > > > + > > > /* Handle common cases where we can easily build an ao_ref > > > structure for USE_STMT and in doing so we find that the > > > references hit non-live bytes and thus can be ignored. > > > > > > > > > -- Richard Biener SUSE Software Solutions Germany GmbH, Frankenstrasse 146, 90461 Nuernberg, Germany; GF: Ivo Totev, Andrew Myers, Andrew McDonald, Boudien Moerman; HRB 36809 (AG Nuernberg)