From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (qmail 121022 invoked by alias); 4 May 2017 12:06:27 -0000 Mailing-List: contact gcc-patches-help@gcc.gnu.org; run by ezmlm Precedence: bulk List-Id: List-Archive: List-Post: List-Help: Sender: gcc-patches-owner@gcc.gnu.org Received: (qmail 120187 invoked by uid 89); 4 May 2017 12:06:26 -0000 Authentication-Results: sourceware.org; auth=none X-Virus-Found: No X-Spam-SWARE-Status: No, score=-11.8 required=5.0 tests=AWL,BAYES_00,GIT_PATCH_2,GIT_PATCH_3,RCVD_IN_DNSWL_NONE,RCVD_IN_SORBS_SPAM,SPF_PASS autolearn=ham version=3.3.2 spammy= X-HELO: mail-wr0-f173.google.com Received: from mail-wr0-f173.google.com (HELO mail-wr0-f173.google.com) (209.85.128.173) by sourceware.org (qpsmtpd/0.93/v0.84-503-g423c35a) with ESMTP; Thu, 04 May 2017 12:06:23 +0000 Received: by mail-wr0-f173.google.com with SMTP id w50so6661021wrc.0 for ; Thu, 04 May 2017 05:06:25 -0700 (PDT) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:mail-followup-to:cc:subject:references :date:in-reply-to:message-id:user-agent:mime-version; bh=NWmYxScjfHOTk1r7kPBNllUYecn4Xd7zBHyJOxKTQGM=; b=jASmSP7e7JnMS9Si65KSWJhswEg7mzdzG8NYC+e+8x7uafwdVHVDqmaChy4fvNIeKf Aa7R6lbU4WHxhHedXdx8rZvuVKVu5S5MalZIxM/Q2QWEplpRgdZ9JzpWpacRPIYcOMHw Rtmjlb5wN8/nWmS5IzvFCUBJtS0UyNZelPJDoUUmfUrspYGbBdxqg64wRhFaMUAubNI5 f3ku2CuzLSI3KVolX5CzQB5R4vBAEd4jEVmgqduVmg00QOEF0Nwu52NhLN6c0pSH6iKB D/UQNgK/fKKBDttZdzZdWubhyO4hwkwrTrE3t2yna2b9ijEUX+/mX5XDIOmIOaCWyDQx 3Nxw== X-Gm-Message-State: AN3rC/7O6TPnt4RLxjKKHiUM7WHF24Iw2BTbkvlxVxLPRlIc7f/xoXj7 lvHuSlG+D/CYLpUa X-Received: by 10.223.135.84 with SMTP id 20mr17815781wrz.141.1493899583539; Thu, 04 May 2017 05:06:23 -0700 (PDT) Received: from localhost (92.40.248.22.threembb.co.uk. [92.40.248.22]) by smtp.gmail.com with ESMTPSA id w18sm1927799wra.34.2017.05.04.05.06.22 (version=TLS1_2 cipher=ECDHE-RSA-CHACHA20-POLY1305 bits=256/256); Thu, 04 May 2017 05:06:22 -0700 (PDT) From: Richard Sandiford To: "Bin.Cheng" Mail-Followup-To: "Bin.Cheng" ,gcc-patches List , richard.sandiford@linaro.org Cc: gcc-patches List Subject: Re: Handle data dependence relations with different bases References: <87tw52s73r.fsf@linaro.org> <87efw53pik.fsf@linaro.org> Date: Thu, 04 May 2017 12:11:00 -0000 In-Reply-To: (Bin Cheng's message of "Thu, 4 May 2017 11:49:15 +0100") Message-ID: <87a86s4yir.fsf@linaro.org> User-Agent: Gnus/5.13 (Gnus v5.13) Emacs/25.1 (gnu/linux) MIME-Version: 1.0 Content-Type: text/plain X-SW-Source: 2017-05/txt/msg00291.txt.bz2 "Bin.Cheng" writes: > On Thu, May 4, 2017 at 11:06 AM, Richard Sandiford > wrote: >> "Bin.Cheng" writes: >>> On Wed, May 3, 2017 at 9:00 AM, Richard Sandiford >>> wrote: >>>> Index: gcc/tree-data-ref.c >>>> =================================================================== >>>> --- gcc/tree-data-ref.c 2017-02-23 19:54:15.000000000 +0000 >>>> +++ gcc/tree-data-ref.c 2017-05-03 08:48:48.737038502 +0100 >>>> @@ -123,8 +123,7 @@ Software Foundation; either version 3, o >>>> } dependence_stats; >>>> >>>> static bool subscript_dependence_tester_1 (struct > data_dependence_relation *, >>>> - struct data_reference *, >>>> - struct data_reference *, >>>> + unsigned int, unsigned int, >>>> struct loop *); >>> As mentioned, how about passing access_fn directly, rather than less >>> meaningful 0/1 values? >> >> The problem is that access_fn is a property of the individual >> subscripts, whereas this is operating on a full data_reference. >> >> One alternative would be to use conditions like: >> >> first_is_a ? SUB_ACCESS_FN_A (sub) : SUB_ACCESS_FN_B (sub) >> >> but IMO that's less readable than the existing: >> >> SUB_ACCESS_FN (sub, index) >> >> Or we could have individual access_fn arrays for A and B, separate >> from the main subscript array, but that would mean allocating three >> arrays instead of one. > Thanks for explanation, I see the problem now. Even the latter > sequence could be different for A and B, there should have the same > number index? If that's the case, is it possible just recording the > starting position (or length) in DR_ACCESS_FN and use that information > to access to access_fn vector. This can save the copy in subscript. > Anyway, this is not am important problem. I think that would mean trading fields in the subscript for fields in the main ddr structure. One advantage of doing it in the subscript is that those are freed after the analysis is complete, whereas the ddr stays around until the caller has finished with it. >>>> + latter sequence. */ >>>> + unsigned int start_a = 0; >>>> + unsigned int start_b = 0; >>>> + unsigned int num_dimensions = 0; >>>> + unsigned int struct_start_a = 0; >>>> + unsigned int struct_start_b = 0; >>>> + unsigned int struct_num_dimensions = 0; >>>> + unsigned int index_a = 0; >>>> + unsigned int index_b = 0; >>>> + tree next_ref_a = DR_REF (a); >>>> + tree next_ref_b = DR_REF (b); >>>> + tree struct_ref_a = NULL_TREE; >>>> + tree struct_ref_b = NULL_TREE; >>>> + while (index_a < num_dimensions_a && index_b < num_dimensions_b) >>>> + { >>>> + gcc_checking_assert (handled_component_p (next_ref_a)); >>>> + gcc_checking_assert (handled_component_p (next_ref_b)); >>>> + tree outer_ref_a = TREE_OPERAND (next_ref_a, 0); >>>> + tree outer_ref_b = TREE_OPERAND (next_ref_b, 0); >>>> + tree type_a = TREE_TYPE (outer_ref_a); >>>> + tree type_b = TREE_TYPE (outer_ref_b); >>>> + if (types_compatible_p (type_a, type_b)) >>>> + { >>>> + /* This pair of accesses belong to a suitable sequence. */ >>>> + if (start_a + num_dimensions != index_a >>>> + || start_b + num_dimensions != index_b) >>>> + { >>>> + /* Start a new sequence here. */ >>>> + start_a = index_a; >>>> + start_b = index_b; >>>> + num_dimensions = 0; >>>> + } >>>> + num_dimensions += 1; >>>> + if (TREE_CODE (type_a) == RECORD_TYPE) >>>> + { >>>> + struct_start_a = start_a; >>>> + struct_start_b = start_b; >>>> + struct_num_dimensions = num_dimensions; >>>> + struct_ref_a = outer_ref_a; >>>> + struct_ref_b = outer_ref_b; >>>> + } >>>> + next_ref_a = outer_ref_a; >>>> + next_ref_b = outer_ref_b; >>>> + index_a += 1; >>>> + index_b += 1; >>>> + continue; >>>> + } >>>> + /* Try to approach equal type sizes. */ >>>> + if (!COMPLETE_TYPE_P (type_a) >>>> + || !COMPLETE_TYPE_P (type_b) >>>> + || !tree_fits_uhwi_p (TYPE_SIZE_UNIT (type_a)) >>>> + || !tree_fits_uhwi_p (TYPE_SIZE_UNIT (type_b))) >>>> + break; >>>> + unsigned HOST_WIDE_INT size_a = tree_to_uhwi (TYPE_SIZE_UNIT > (type_a)); >>>> + unsigned HOST_WIDE_INT size_b = tree_to_uhwi (TYPE_SIZE_UNIT > (type_b)); >>>> + if (size_a <= size_b) >>>> + { >>>> + index_a += 1; >>>> + next_ref_a = outer_ref_a; >>>> + } >>>> + if (size_b <= size_a) >>>> + { >>>> + index_b += 1; >>>> + next_ref_b = outer_ref_b; >>>> + } >>>> } >>>> >>>> - /* If the references do not access the same object, we do not know >>>> - whether they alias or not. We do not care about TBAA or alignment >>>> - info so we can use OEP_ADDRESS_OF to avoid false negatives. >>>> - But the accesses have to use compatible types as otherwise the >>>> - built indices would not match. */ >>>> - if (!operand_equal_p (DR_BASE_OBJECT (a), DR_BASE_OBJECT (b), >>> OEP_ADDRESS_OF) >>>> - || !types_compatible_p (TREE_TYPE (DR_BASE_OBJECT (a)), >>>> - TREE_TYPE (DR_BASE_OBJECT (b)))) >>>> + /* See whether the sequence ends at the base and whether the two bases >>>> + are equal. We do not care about TBAA or alignment info so we can use >>>> + OEP_ADDRESS_OF to avoid false negatives. */ >>>> + tree base_a = DR_BASE_OBJECT (a); >>>> + tree base_b = DR_BASE_OBJECT (b); >>>> + bool same_base_p = (start_a + num_dimensions == num_dimensions_a >>>> + && start_b + num_dimensions == num_dimensions_b >>>> + && DR_UNCONSTRAINED_BASE (a) == DR_UNCONSTRAINED_BASE (b) >>>> + && operand_equal_p (base_a, base_b, OEP_ADDRESS_OF) >>>> + && types_compatible_p (TREE_TYPE (base_a), >>>> + TREE_TYPE (base_b)) >>>> + && (!loop_nest.exists () >>>> + || (object_address_invariant_in_loop_p >>>> + (loop_nest[0], base_a)))); >>> Major change is in function initialize_data_dependence_relation in >>> order to detect partial alias opportunity. The original equality >>> check on DR_BASE_OBJECT is bypassed now. IMHO better to introduce a >>> new parameter to compute_data_reference_for_loop etc., indicating >>> whether we want to handle partial alias opportunity or not. After >>> all, such computation is unnecessary for predcom/prefetch/parloop. >>> It's only a waste of time computing it. >> >> Well, it also means that we can now prove the accesses are independent >> in more cases. E.g. previously we would assume the a and b accesses in: > Predcom cares about dependent refs with constant distance, so > independent (neither possible dependent) information based on partial > alias is not interested. Yeah, but it checks for that itself, based on the individual data references. E.g.: if (!operand_equal_p (DR_STEP (a), DR_STEP (b), 0) || !operand_equal_p (DR_BASE_ADDRESS (a), DR_BASE_ADDRESS (b), 0)) return false; This condition is independent of the ddr. My point is that if we wanted to avoid doing redundant work in this kind of situation, we should reorganise things so that we filter uninteresting pairs of references out *before* creating the ddr. Even if we passed a flag down to initialize_data_dependence_relation, we'd still end up with a ddr for the uninteresting pairs, just like we have now. And if you're using compute_all_dependences, those pairs still count towards PARAM_LOOP_MAX_DATAREFS_FOR_DATADEPS. (I think the only thing predcom uses the ddr for is to check for relations that are known to be independent. So creating specific distance and direction vectors is itself redundant work.) >> >> struct s { int x[16]; } *a, *b; >> for (int i = 0; i < 8; ++i) >> a->x[i] = b->x[i + 8]; >> >> could conflict. >> >> If callers don't need to know what the relationship between a and b is, >> I think they should check for that before going through the process of >> initialising and analysing the ddr. > This I don't think so. Users don't have the information to > pre-check/analyze reference pair. Even it can do that by repeating > most work as in data-ref-analyzer, it sounds not a good practice. > That's exactly analyzer's job and the reason why interfaces like > compute_data_dependence_for_loop are introduced. It doesn't make much > sense requiring users to do additional analysis before looking for > help from data-ref-analyzer. (The reply above was for this too.) Thanks, Richard