From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (qmail 93448 invoked by alias); 14 Jun 2017 09:52:36 -0000 Mailing-List: contact gcc-patches-help@gcc.gnu.org; run by ezmlm Precedence: bulk List-Id: List-Archive: List-Post: List-Help: Sender: gcc-patches-owner@gcc.gnu.org Received: (qmail 92779 invoked by uid 89); 14 Jun 2017 09:52:35 -0000 Authentication-Results: sourceware.org; auth=none X-Virus-Found: No X-Spam-SWARE-Status: No, score=-1.9 required=5.0 tests=AWL,BAYES_00,FREEMAIL_FROM,RCVD_IN_DNSWL_NONE,SPF_PASS autolearn=ham version=3.3.2 spammy= X-HELO: mail-ot0-f180.google.com Received: from mail-ot0-f180.google.com (HELO mail-ot0-f180.google.com) (74.125.82.180) by sourceware.org (qpsmtpd/0.93/v0.84-503-g423c35a) with ESMTP; Wed, 14 Jun 2017 09:52:33 +0000 Received: by mail-ot0-f180.google.com with SMTP id s7so13497776otb.3 for ; Wed, 14 Jun 2017 02:52:38 -0700 (PDT) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:mime-version:in-reply-to:references:from:date :message-id:subject:to:cc; bh=4smxcaEgKsT2Wkn2Z8kZiy5eMFnTYpiYKHHlNDOS3Vs=; b=ifa8w3nz370VZgmnYRbL3U5Ze0whtL3BFyeuvaIjC/lSuHUzbDOoD/NWbFMszZpcqe gzHA7V2J+Ag+u2CmrfJVvkX/JY3y+iS4I4BHoaRxvHPj+seFAzleZPXAc47ErzDTlona AIyW4Kozxy0v4MErRj41kb5UiBgWwkCzLel3v1MIkPGbCwxBZnHtL38OopNoOBV+U/Rb OHfzgysUBj5Pl4pnfJQLXe7jGS0ha53A/ZEok8NnqlesuvI3lCZkPay9vw5f0lnyB2si GYUMh3cERxtQ4EgNfAA6fN4oMz17ZSaajYodXxDTnz1ovS9st88bEcg0Wh/ZxsCxeswO hnkg== X-Gm-Message-State: AKS2vOz3QZpazZUZyTsNIcmKLYT8BSNyAEj5KnG3ZxKJUuRjkncmwpQ9 Gxj8A8a4b9cEvf94qEipoYb96DZrPg== X-Received: by 10.157.17.25 with SMTP id g25mr3121582ote.105.1497433956633; Wed, 14 Jun 2017 02:52:36 -0700 (PDT) MIME-Version: 1.0 Received: by 10.157.36.8 with HTTP; Wed, 14 Jun 2017 02:52:36 -0700 (PDT) In-Reply-To: References: From: Richard Biener Date: Wed, 14 Jun 2017 09:52:00 -0000 Message-ID: Subject: Re: [PATCH GCC][04/13]Sort statements in topological order for loop distribution To: "Bin.Cheng" Cc: "gcc-patches@gcc.gnu.org" Content-Type: text/plain; charset="UTF-8" X-IsSubscribed: yes X-SW-Source: 2017-06/txt/msg01012.txt.bz2 On Wed, Jun 14, 2017 at 11:25 AM, Bin.Cheng wrote: > On Wed, Jun 14, 2017 at 10:15 AM, Richard Biener > wrote: >> On Wed, Jun 14, 2017 at 9:53 AM, Bin.Cheng wrote: >>> On Tue, Jun 13, 2017 at 11:59 AM, Richard Biener >>> wrote: >>>> On Mon, Jun 12, 2017 at 7:02 PM, Bin Cheng wrote: >>>>> Hi, >>>>> During the work I ran into a latent bug for distributing. For the moment we sort statements >>>>> in dominance order, but that's not enough because basic blocks may be sorted in reverse order >>>>> of execution flow. This results in wrong data dependence direction later. This patch fixes >>>>> the issue by sorting in topological order. >>>>> >>>>> Bootstrap and test on x86_64 and AArch64. Is it OK? >>>> >>>> I suppose you are fixing >>>> >>>> static int >>>> pg_add_dependence_edges (struct graph *rdg, vec loops, int dir, >>>> vec drs1, >>>> vec drs2) >>>> { >>>> ... >>>> /* Re-shuffle data-refs to be in dominator order. */ >>>> if (rdg_vertex_for_stmt (rdg, DR_STMT (dr1)) >>>> > rdg_vertex_for_stmt (rdg, DR_STMT (dr2))) >>>> { >>>> std::swap (dr1, dr2); >>>> this_dir = -this_dir; >>>> } >>>> >>>> but then for stmts that are not "ordered" by RPO or DOM like >>>> >>>> if (flag) >>>> ... = dr1; >>>> else >>>> ... = dr2; >>>> >>>> this doesn't avoid spurious swaps? Also the code was basically >>> No, this is mainly for below case: >>> if (flag) >>> { >>> partition1: arr[i] = x; >>> } >>> partition2: arr[i] = y; >>> >>> function pg_add_dependence_edges is like: >>> /* Re-shuffle data-refs to be in dominator order. */ >>> if (rdg_vertex_for_stmt (rdg, DR_STMT (dr1)) >>> > rdg_vertex_for_stmt (rdg, DR_STMT (dr2))) >>> { >>> std::swap (dr1, dr2); >>> this_dir = -this_dir; >>> } >>> //... >>> else if (DDR_ARE_DEPENDENT (ddr) == NULL_TREE) >>> { >>> if (DDR_REVERSED_P (ddr)) >>> { >>> std::swap (dr1, dr2); >>> this_dir = -this_dir; >>> } >>> /* Known dependences can still be unordered througout the >>> iteration space, see gcc.dg/tree-ssa/ldist-16.c. */ >>> if (DDR_NUM_DIST_VECTS (ddr) != 1) >>> this_dir = 2; >>> /* If the overlap is exact preserve stmt order. */ >>> else if (lambda_vector_zerop (DDR_DIST_VECT (ddr, 0), 1)) >>> ; >>> else >>> { >>> /* Else as the distance vector is lexicographic positive >>> swap the dependence direction. */ >>> this_dir = -this_dir; >>> } >>> } >>> For non-ZERO distance vector dependence, the second part always >>> computes src->dest dependence info correctly, as well as edge >>> direction of PG. For ZERO distance vector dependence, we rely on the >>> swap part (thus topological order) to get correct dependence >>> direction. For mentioned case, the two data references are unordered >>> in dominance relation, but ordered in RPO. This is why DOM is not >>> enough. For if-then-else case, the order actually doesn't matter, and >>> the references are unordered in either dominance relation or RPO. >>> Specifically, src->dest is always computed correctly for non-ZERO >>> distance vector cases, no matter or is passed to >>> data dependence analyzer. As for ZERO distance vector (exact >>> overlap), the order doesn't matter either because they control >>> dependent on the same condition. We can simply assume an arbitrary >>> order for it. >> >> Ok, if it only is an issue for zero-distance then yes, I agree. > Yeah, so for distribution, I think we can further simplify > pg_add_dependence_edge because swap is only necessary for > zero-distance cases. >> >>>> copied from tree-data-refs.c:find_data_references_in_loop which >>>> does iterate over get_loop_body_in_dom_order as well. So isn't the >>>> issue latent there as well? >>> In theory maybe. In practice, this is not a problem at all since loop >>> distribution is the only one handles control dependence so far. >> >> You mean the only one running into the bogus BB ordering case. >> I don't see how handling control dependences factor in here. > Yes, that's what I meant. The ordering issue doesn't exist if there > is no control dependence between basic blocks in loop body I think. I > didn't realize the autopar pass. >> >>>> >>>> That said, what about those "unordered" stmts? I suppose >>>> dependence analysis still happily computes a dependence >>>> distance but in reality we'd have to consider both execution >>>> orders? >>> As explained, there is no need to consider both orders. GCC doesn't >>> really support control dependence parallelization? >> >> I think autopar supports an arbitrary CFG inside the loops but as it >> will never split them it won't change stmt ordering for zero-distance. >> >> That said, if dependence info can be incorrect if applied to a loop >> we should fixup tree-data-refs.c as well. It might be useful >> to make get_loop_body_in_rpo_order available then (and eventually >> all _in_dom_order users can work with rpo order as well thus we >> can replace it entirely as a second step). > I can work on that as a followup patch. Is it OK? Yes. Thanks, Richard. > Thanks, > bin >> >> Thanks, >> Richard. >> >>> Thanks, >>> bin >>>> >>>> Thanks, >>>> Richard. >>>> >>>> >>>>> >>>>> Thanks, >>>>> bin >>>>> 2017-06-07 Bin Cheng >>>>> >>>>> * tree-loop-distribution.c (bb_top_order_index): New. >>>>> (bb_top_order_index_size, bb_top_order_cmp): New. >>>>> (stmts_from_loop): Use topological order. >>>>> (pass_loop_distribution::execute): Compute topological order for. >>>>> basic blocks.