From mboxrd@z Thu Jan 1 00:00:00 1970
Mailing-List: contact gcc-patches-help@gcc.gnu.org; run by ezmlm
Precedence: bulk
Sender: gcc-patches-owner@gcc.gnu.org
MIME-Version: 1.0
From: "Bin.Cheng"
Date: Tue, 20 Jun 2017 09:20:00 -0000
Subject: Re: [PATCH GCC][12/13]Workaround reduction
 statements for distribution
To: Richard Biener
Cc: "gcc-patches@gcc.gnu.org"
Content-Type: multipart/mixed; boundary="f403045e1f4204ab7d055260c440"
X-IsSubscribed: yes
X-SW-Source: 2017-06/txt/msg01411.txt.bz2

--f403045e1f4204ab7d055260c440
Content-Type: text/plain; charset="UTF-8"
Content-length: 2719

On Fri, Jun 16, 2017 at 6:15 PM, Bin.Cheng wrote:
> On Fri, Jun 16, 2017 at 11:21 AM, Richard Biener wrote:
>> On Mon, Jun 12, 2017 at 7:03 PM, Bin Cheng wrote:
>>> Hi,
>>> For now, loop distribution treats variables used outside of the loop as
>>> reductions. This is inaccurate because all partitions contain the
>>> statements defining induction vars.
>>
>> But final induction values are usually not used outside of the loop...
> This is actually for an induction variable that is used outside of the loop.
>>
>> What is missing is loop distribution trying to change partition order. In fact
>> we somehow assume we can move a reduction across a detected builtin
>> (I don't remember if we ever check for validity of that...).
> Hmm, I am not sure when we can't. If there is any dependence between
> builtin/reduction partitions, it should be captured by RDG or PG;
> otherwise the partitions are independent and can be freely ordered as
> long as the reduction partition is scheduled last?
>>
>>> Ideally we should factor out scev-propagation as a standalone interface
>>> which can be called when necessary. Before that, this patch simply works
>>> around the reduction issue by checking if the statement belongs to all
>>> partitions. If yes, the reduction must be computed in the last partition
>>> no matter how the loop is distributed.
>>> Bootstrapped and tested on x86_64 and AArch64. Is it OK?
>>
>> stmt_in_all_partitions is not kept up-to-date during partition merging and if
>> merging makes the reduction partition(s) pass the stmt_in_all_partitions
>> test your simple workaround doesn't work ...
> I think it doesn't matter because:
> A) It's really a workaround for induction variables.
> In general, induction variables are included in all partitions.
> B) After classifying partitions, we immediately fuse all reduction
> partitions. More statements in stmt_in_all_partitions means we are fusing
> a non-reduction partition with a reduction partition, so the newly
> generated (stmt_in_all_partitions) are actually not reduction
> statements. The workaround won't work anyway even if the bitmap is
> maintained.
>>
>> As written it's a valid optimization but can you please note its limitation in
>> some comment?
> Yeah, I will add a comment explaining it.
A comment is added in the new version of the patch. It also computes the
bitmap outside now, is it OK?

Thanks,
bin

2017-06-07  Bin Cheng

	* tree-loop-distribution.c (classify_partition): New parameter and
	better handle reduction statement.
	(rdg_build_partitions): Revise comment.
	(distribute_loop): Compute statements in all partitions and pass it
	to classify_partition.

--f403045e1f4204ab7d055260c440
Content-Type: text/x-patch; charset="US-ASCII"; name="0011-reduction-workaround-20170607.txt.patch"
Content-Disposition: attachment; filename="0011-reduction-workaround-20170607.txt.patch"
Content-Transfer-Encoding: base64
X-Attachment-Id: f_j45d0adq0
Content-length: 5462

RnJvbSBiMzUwMmQ3MTMzMDlkYTA4ZDkzY2Q1M2U1ZmU4ZmJmZGNjZjM1NTdi IE1vbiBTZXAgMTcgMDA6MDA6MDAgMjAwMQpGcm9tOiBCaW4gQ2hlbmcgPGJp bmNoZTAxQGUxMDg0NTEtbGluLmNhbWJyaWRnZS5hcm0uY29tPgpEYXRlOiBG cmksIDkgSnVuIDIwMTcgMTM6MjE6MDcgKzAxMDAKU3ViamVjdDogW1BBVENI IDExLzEzXSByZWR1Y3Rpb24td29ya2Fyb3VuZC0yMDE3MDYwNy50eHQKCi0t LQogZ2NjL3RyZWUtbG9vcC1kaXN0cmlidXRpb24uYyB8IDQzICsrKysrKysr KysrKysrKysrKysrKysrKysrKysrKysrLS0tLS0tLS0tLS0KIDEgZmlsZSBj aGFuZ2VkLCAzMiBpbnNlcnRpb25zKCspLCAxMSBkZWxldGlvbnMoLSkKCmRp ZmYgLS1naXQgYS9nY2MvdHJlZS1sb29wLWRpc3RyaWJ1dGlvbi5jIGIvZ2Nj L3RyZWUtbG9vcC1kaXN0cmlidXRpb24uYwppbmRleCBkNzQxZTllLi4xYjRl MjM4IDEwMDY0NAotLS0gYS9nY2MvdHJlZS1sb29wLWRpc3RyaWJ1dGlvbi5j CisrKyBiL2djYy90cmVlLWxvb3AtZGlzdHJpYnV0aW9uLmMKQEAgLTEyMjYs MTcgKzEyMjYsMTggQEAgYnVpbGRfcmRnX3BhcnRpdGlvbl9mb3JfdmVydGV4
IChzdHJ1Y3QgZ3JhcGggKnJkZywgaW50IHYpCiB9CiAKIC8qIENsYXNzaWZp ZXMgdGhlIGJ1aWx0aW4ga2luZCB3ZSBjYW4gZ2VuZXJhdGUgZm9yIFBBUlRJ VElPTiBvZiBSREcgYW5kIExPT1AuCi0gICBGb3IgdGhlIG1vbWVudCB3ZSBk ZXRlY3Qgb25seSB0aGUgbWVtc2V0IHplcm8gcGF0dGVybi4gICovCisgICBG b3IgdGhlIG1vbWVudCB3ZSBkZXRlY3QgbWVtc2V0LCBtZW1jcHkgYW5kIG1l bW1vdmUgcGF0dGVybnMuICBCaXRtYXAKKyAgIFNUTVRfSU5fQUxMX1BBUlRJ VElPTlMgY29udGFpbnMgc3RhdGVtZW50cyBiZWxvbmdpbmcgdG8gYWxsIHBh cnRpdGlvbnMuICAqLwogCiBzdGF0aWMgdm9pZAotY2xhc3NpZnlfcGFydGl0 aW9uIChsb29wX3AgbG9vcCwgc3RydWN0IGdyYXBoICpyZGcsIHBhcnRpdGlv biAqcGFydGl0aW9uKQorY2xhc3NpZnlfcGFydGl0aW9uIChsb29wX3AgbG9v cCwgc3RydWN0IGdyYXBoICpyZGcsIHBhcnRpdGlvbiAqcGFydGl0aW9uLAor CQkgICAgYml0bWFwIHN0bXRfaW5fYWxsX3BhcnRpdGlvbnMpCiB7CiAgIGJp dG1hcF9pdGVyYXRvciBiaTsKICAgdW5zaWduZWQgaTsKICAgdHJlZSBuYl9p dGVyOwogICBkYXRhX3JlZmVyZW5jZV9wIHNpbmdsZV9sb2FkLCBzaW5nbGVf c3RvcmU7Ci0gIGJvb2wgdm9sYXRpbGVzX3AgPSBmYWxzZTsKLSAgYm9vbCBw bHVzX29uZSA9IGZhbHNlOworICBib29sIHZvbGF0aWxlc19wID0gZmFsc2Us IHBsdXNfb25lID0gZmFsc2UsIGhhc19yZWR1Y3Rpb24gPSBmYWxzZTsKIAog ICBwYXJ0aXRpb24tPmtpbmQgPSBQS0lORF9OT1JNQUw7CiAgIHBhcnRpdGlv bi0+bWFpbl9kciA9IE5VTEw7CkBAIC0xMjUxLDE2ICsxMjUyLDMxIEBAIGNs YXNzaWZ5X3BhcnRpdGlvbiAobG9vcF9wIGxvb3AsIHN0cnVjdCBncmFwaCAq cmRnLCBwYXJ0aXRpb24gKnBhcnRpdGlvbikKICAgICAgIGlmIChnaW1wbGVf aGFzX3ZvbGF0aWxlX29wcyAoc3RtdCkpCiAJdm9sYXRpbGVzX3AgPSB0cnVl OwogCi0gICAgICAvKiBJZiB0aGUgc3RtdCBoYXMgdXNlcyBvdXRzaWRlIG9m IHRoZSBsb29wIG1hcmsgaXQgYXMgcmVkdWN0aW9uLiAgKi8KKyAgICAgIC8q IElmIHRoZSBzdG10IGlzIG5vdCBpbmNsdWRlZCBieSBhbGwgcGFydGl0aW9u cyBhbmQgdGhlcmUgaXMgdXNlcworCSBvdXRzaWRlIG9mIHRoZSBsb29wLCB0 aGVuIG1hcmsgdGhlIHBhcnRpdGlvbiBhcyByZWR1Y3Rpb24uICAqLwogICAg ICAgaWYgKHN0bXRfaGFzX3NjYWxhcl9kZXBlbmRlbmNlc19vdXRzaWRlX2xv b3AgKGxvb3AsIHN0bXQpKQogCXsKLQkgIHBhcnRpdGlvbi0+cmVkdWN0aW9u X3AgPSB0cnVlOwotCSAgcmV0dXJuOworCSAgLyogRHVlIHRvIGxpbWl0YXRp b24gaW4gdGhlIHRyYW5zZm9ybSBwaGFzZSB3ZSBoYXZlIHRvIGZ1c2UgYWxs CisJICAgICByZWR1Y3Rpb24gcGFydGl0aW9ucy4gIEFzIGEgcmVzdWx0LCB0 
aGlzIGNvdWxkIGNhbmNlbCB2YWxpZAorCSAgICAgbG9vcCBkaXN0cmlidXRp b24gZXNwZWNpYWxseSBmb3IgbG9vcCB0aGF0IGluZHVjdGlvbiB2YXJpYWJs ZQorCSAgICAgaXMgdXNlZCBvdXRzaWRlIG9mIGxvb3AuICBUbyB3b3JrYXJv dW5kIHRoaXMgaXNzdWUsIHdlIHNraXAKKwkgICAgIG1hcmtpbmcgcGFydGl0 aW9uIGFzIHJldWRjdGlvbiBpZiB0aGUgcmVkdWN0aW9uIHN0bXQgYmVsb25n cworCSAgICAgdG8gYWxsIHBhcnRpdGlvbnMuICBJbiBzdWNoIGNhc2UsIHJl ZHVjdGlvbiB3aWxsIGJlIGNvbXB1dGVkCisJICAgICBjb3JyZWN0bHkgbm8g bWF0dGVyIGhvdyBwYXJ0aXRpb25zIGFyZSBmdXNlZC9kaXN0cmlidXRlZC4g ICovCisJICBpZiAoIWJpdG1hcF9iaXRfcCAoc3RtdF9pbl9hbGxfcGFydGl0 aW9ucywgaSkpCisJICAgIHsKKwkgICAgICBwYXJ0aXRpb24tPnJlZHVjdGlv bl9wID0gdHJ1ZTsKKwkgICAgICByZXR1cm47CisJICAgIH0KKwkgIGhhc19y ZWR1Y3Rpb24gPSB0cnVlOwogCX0KICAgICB9CiAKICAgLyogUGVyZm9ybSBn ZW5lcmFsIHBhcnRpdGlvbiBkaXNxdWFsaWZpY2F0aW9uIGZvciBidWlsdGlu cy4gICovCiAgIGlmICh2b2xhdGlsZXNfcAorICAgICAgLyogU2ltcGxlIHdv cmthcm91bmQgdG8gcHJldmVudCBjbGFzc2lmeWluZyB0aGUgcGFydGl0aW9u IGFzIGJ1aWx0aW4KKwkgaWYgaXQgY29udGFpbnMgYW55IHVzZSBvdXRzaWRl IG9mIGxvb3AuICAqLworICAgICAgfHwgaGFzX3JlZHVjdGlvbgogICAgICAg fHwgIWZsYWdfdHJlZV9sb29wX2Rpc3RyaWJ1dGVfcGF0dGVybnMpCiAgICAg cmV0dXJuOwogCkBAIC0xNDM1LDkgKzE0NTEsOSBAQCBzaGFyZV9tZW1vcnlf YWNjZXNzZXMgKHN0cnVjdCBncmFwaCAqcmRnLAogICByZXR1cm4gZmFsc2U7 CiB9CiAKLS8qIEFnZ3JlZ2F0ZSBzZXZlcmFsIGNvbXBvbmVudHMgaW50byBh IHVzZWZ1bCBwYXJ0aXRpb24gdGhhdCBpcwotICAgcmVnaXN0ZXJlZCBpbiB0 aGUgUEFSVElUSU9OUyB2ZWN0b3IuICBQYXJ0aXRpb25zIHdpbGwgYmUKLSAg IGRpc3RyaWJ1dGVkIGluIGRpZmZlcmVudCBsb29wcy4gICovCisvKiBGb3Ig ZWFjaCBzZWVkIHN0YXRlbWVudCBpbiBTVEFSVElOR19TVE1UUywgdGhpcyBm dW5jdGlvbiBidWlsZHMKKyAgIHBhcnRpdGlvbiBmb3IgaXQgYnkgYWRkaW5n IGRlcGVuZGVkIHN0YXRlbWVudHMgYWNjb3JkaW5nIHRvIFJERy4KKyAgIEFs bCBwYXJ0aXRpb25zIGFyZSByZWNvcmRlZCBpbiBQQVJUSVRJT05TLiAgKi8K IAogc3RhdGljIHZvaWQKIHJkZ19idWlsZF9wYXJ0aXRpb25zIChzdHJ1Y3Qg Z3JhcGggKnJkZywKQEAgLTE3MDUsMTAgKzE3MjEsMTUgQEAgZGlzdHJpYnV0 ZV9sb29wIChzdHJ1Y3QgbG9vcCAqbG9vcCwgdmVjPGdpbXBsZSAqPiBzdG10 cywKICAgYXV0b192ZWM8c3RydWN0IHBhcnRpdGlvbiAqLCAzPiBwYXJ0aXRp 
b25zOwogICByZGdfYnVpbGRfcGFydGl0aW9ucyAocmRnLCBzdG10cywgJnBh cnRpdGlvbnMpOwogCisgIGF1dG9fYml0bWFwIHN0bXRfaW5fYWxsX3BhcnRp dGlvbnM7CisgIGJpdG1hcF9jb3B5IChzdG10X2luX2FsbF9wYXJ0aXRpb25z LCBwYXJ0aXRpb25zWzBdLT5zdG10cyk7CisgIGZvciAoaSA9IDE7IHBhcnRp dGlvbnMuaXRlcmF0ZSAoaSwgJnBhcnRpdGlvbik7ICsraSkKKyAgICBiaXRt YXBfYW5kX2ludG8gKHN0bXRfaW5fYWxsX3BhcnRpdGlvbnMsIHBhcnRpdGlv bnNbaV0tPnN0bXRzKTsKKwogICBhbnlfYnVpbHRpbiA9IGZhbHNlOwogICBG T1JfRUFDSF9WRUNfRUxUIChwYXJ0aXRpb25zLCBpLCBwYXJ0aXRpb24pCiAg ICAgewotICAgICAgY2xhc3NpZnlfcGFydGl0aW9uIChsb29wLCByZGcsIHBh cnRpdGlvbik7CisgICAgICBjbGFzc2lmeV9wYXJ0aXRpb24gKGxvb3AsIHJk ZywgcGFydGl0aW9uLCBzdG10X2luX2FsbF9wYXJ0aXRpb25zKTsKICAgICAg IGFueV9idWlsdGluIHw9IHBhcnRpdGlvbl9idWlsdGluX3AgKHBhcnRpdGlv bik7CiAgICAgfQogCi0tIAoxLjkuMQoK --f403045e1f4204ab7d055260c440--
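
For readers following the thread, the check discussed in the message can be sketched roughly as below. This is only an illustration under stated assumptions, not the code in the attached patch: `Partition`, `stmts_in_all_partitions` and `classify` are hypothetical names, and plain `std::set<int>` statement indices stand in for GCC's bitmaps and RDG vertices. The idea is to intersect the statement sets of all partitions once, then treat a partition as a reduction only if it contains a statement used outside the loop that is not shared by every partition.

```cpp
#include <cassert>
#include <cstddef>
#include <set>
#include <vector>

// Hypothetical stand-in for a loop-distribution partition: the indices of
// the statements it contains, plus the reduction flag being decided.
struct Partition {
  std::set<int> stmts;
  bool reduction_p = false;
};

// Intersect the statement sets of all partitions. A statement survives only
// if every partition contains it (typically induction-variable updates).
std::set<int> stmts_in_all_partitions(const std::vector<Partition>& parts) {
  if (parts.empty()) return {};
  std::set<int> common = parts[0].stmts;
  for (std::size_t i = 1; i < parts.size(); ++i) {
    std::set<int> kept;
    for (int s : common)
      if (parts[i].stmts.count(s)) kept.insert(s);
    common.swap(kept);
  }
  return common;
}

// Mark the partition as a reduction only when it holds a statement that is
// used outside the loop (used_outside is assumed input) and that statement
// is NOT present in every partition.
void classify(Partition& p, const std::set<int>& used_outside,
              const std::set<int>& in_all) {
  p.reduction_p = false;
  for (int s : p.stmts) {
    if (used_outside.count(s) && !in_all.count(s)) {
      p.reduction_p = true;
      return;
    }
  }
}
```

Computing the intersection once, before classification, matches the "computes bitmap outside" remark in the message; as noted in the thread, the set is not updated when partitions are later merged.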