From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: (qmail 110783 invoked by alias); 8 Jun 2015 10:27:58 -0000
Mailing-List: contact gcc-patches-help@gcc.gnu.org; run by ezmlm
Precedence: bulk
List-Id:
List-Archive:
List-Post:
List-Help:
Sender: gcc-patches-owner@gcc.gnu.org
Received: (qmail 110760 invoked by uid 89); 8 Jun 2015 10:27:57 -0000
Authentication-Results: sourceware.org; auth=none
X-Virus-Found: No
X-Spam-SWARE-Status: No, score=-2.7 required=5.0 tests=AWL,BAYES_00,FREEMAIL_FROM,RCVD_IN_DNSWL_LOW,SPF_PASS autolearn=ham version=3.3.2
X-HELO: mail-ob0-f182.google.com
Received: from mail-ob0-f182.google.com (HELO mail-ob0-f182.google.com) (209.85.214.182) by sourceware.org (qpsmtpd/0.93/v0.84-503-g423c35a) with (AES128-GCM-SHA256 encrypted) ESMTPS; Mon, 08 Jun 2015 10:27:56 +0000
Received: by obbqz1 with SMTP id qz1so76788700obb.3 for ; Mon, 08 Jun 2015 03:27:54 -0700 (PDT)
MIME-Version: 1.0
X-Received: by 10.202.73.73 with SMTP id w70mr13112781oia.102.1433759274319; Mon, 08 Jun 2015 03:27:54 -0700 (PDT)
Received: by 10.202.58.4 with HTTP; Mon, 8 Jun 2015 03:27:54 -0700 (PDT)
Date: Mon, 08 Jun 2015 10:43:00 -0000
Message-ID:
Subject: [PATCH] Yet another simple fix to enhance outer-loop vectorization.
From: Yuri Rumyantsev
To: gcc-patches, Richard Biener, Igor Zamyatin
Content-Type: multipart/mixed; boundary=001a113dec463d7a580517ff18f8
X-SW-Source: 2015-06/txt/msg00547.txt.bz2

--001a113dec463d7a580517ff18f8
Content-Type: text/plain; charset=UTF-8
Content-length: 1036

Hi All,

Here is a simple fix which allows duplication of outer loops to perform peeling for the number of iterations if the outer loop is marked with pragma omp simd.

Bootstrap and regression testing did not show any new failures.
Is it OK for trunk?

ChangeLog:
2015-06-08 Yuri Rumyantsev

* tree-vect-loop-manip.c (rename_variables_in_bb): Add argument to allow renaming of PHI arguments on edges incoming from outer loop header, add corresponding check before starting the PHI iterator.
(slpeel_tree_duplicate_loop_to_edge_cfg): Introduce new bool variable DUPLICATE_OUTER_LOOP and set it to true for outer loops with true force_vectorize. Set up dominator for outer loop too. Pass DUPLICATE_OUTER_LOOP as argument to rename_variables_in_bb.
(slpeel_can_duplicate_loop_p): Allow duplication of outer loop if it is marked with force_vectorize and has a restricted CFG.
* tree-vect-loop.c (vect_analyze_loop_2): Prohibit alignment peeling for outer loops.

gcc/testsuite/ChangeLog:
* gcc.dg/vect/vect-outer-simd-2.c: New test.

--001a113dec463d7a580517ff18f8
Content-Type: application/octet-stream; name="patch.1"
Content-Disposition: attachment; filename="patch.1"
Content-Transfer-Encoding: base64
X-Attachment-Id: f_ianr8wbz0
Content-length: 8057

SW5kZXg6IHRlc3RzdWl0ZS9nY2MuZGcvdmVjdC92ZWN0LW91dGVyLXNpbWQt Mi5jCj09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09 PT09PT09PT09PT09PT09PT09PT09PT09PT0KLS0tIHRlc3RzdWl0ZS9nY2Mu ZGcvdmVjdC92ZWN0LW91dGVyLXNpbWQtMi5jCShyZXZpc2lvbiAwKQorKysg dGVzdHN1aXRlL2djYy5kZy92ZWN0L3ZlY3Qtb3V0ZXItc2ltZC0yLmMJKHdv cmtpbmcgY29weSkKQEAgLTAsMCArMSw3NSBAQAorLyogeyBkZy1yZXF1aXJl LWVmZmVjdGl2ZS10YXJnZXQgdmVjdF9zaW1kX2Nsb25lcyB9ICovCisvKiB7 IGRnLWFkZGl0aW9uYWwtb3B0aW9ucyAiLWZvcGVubXAtc2ltZCAtZmZhc3Qt bWF0aCIgfSAqLworI2luY2x1ZGUgPHN0ZGxpYi5oPgorI2luY2x1ZGUgInRy ZWUtdmVjdC5oIgorI2RlZmluZSBOIDY0CisKK2Zsb2F0ICpweCwgKnB5Owor ZmxvYXQgKnR4LCAqdHk7CitmbG9hdCAqeDEsICp6MSwgKnQxLCAqdDI7CisK K3N0YXRpYyB2b2lkIGlubGluZSBiYXIgKGNvbnN0IGZsb2F0IGN4LCBmbG9h dCBjeSwKKyAgICAgICAgICAgICAgICAgICAgICAgICBmbG9hdCAqdngsIGZs b2F0ICp2eSkKK3sKKyAgaW50IGo7CisgICAgZm9yIChqID0gMDsgaiA8IE47 ICsraikKKyAgICB7CisgICAgICAgIGNvbnN0IGZsb2F0IGR4ICA9IGN4IC0g cHhbal07CisgICAgICAgIGNvbnN0IGZsb2F0IGR5ICA9IGN5IC0gcHlbal07 CisgICAgICAgICp2eCAgICAgICAgICAgICAgIC09IGR4ICogdHhbal07Cisg ICAgICAgICp2eSAgICAgICAgICAgICAgIC09IGR5ICogdHlbal07CisgICAg fQorfQorCitfX2F0dHJpYnV0ZV9fKChub2lubGluZSwgbm9jbG9uZSkpIHZv aWQgZm9vMSAoaW50IG4pCit7CisgIGludCBpOworI3ByYWdtYSBvbXAgc2lt
ZAorICBmb3IgKGk9MDsgaTxuOyBpKyspCisgICAgYmFyIChweFtpXSwgcHlb aV0sIHgxK2ksIHoxK2kpOworfQorCitfX2F0dHJpYnV0ZV9fKChub2lubGlu ZSwgbm9jbG9uZSkpIHZvaWQgZm9vMiAoaW50IG4pCit7CisgIHZvbGF0aWxl IGludCBpOworICBmb3IgKGk9MDsgaTxuOyBpKyspCisgICAgYmFyIChweFtp XSwgcHlbaV0sIHgxK2ksIHoxK2kpOworfQorCisKK2ludCBtYWluICgpCit7 CisgIGZsb2F0ICpYID0gKGZsb2F0KiltYWxsb2MgKE4gKiA4ICogc2l6ZW9m IChmbG9hdCkpOworICBpbnQgaTsKKyAgaW50IG4gPSBOIC0gMTsKKyAgY2hl Y2tfdmVjdCAoKTsKKyAgcHggPSAmWFswXTsKKyAgcHkgPSAmWFtOICogMV07 CisgIHR4ID0gJlhbTiAqIDJdOworICB0eSA9ICZYW04gKiAzXTsKKyAgeDEg PSAmWFtOICogNF07CisgIHoxID0gJlhbTiAqIDVdOworICB0MSA9ICZYW04g KiA2XTsKKyAgdDIgPSAmWFtOICogN107CisKKyAgZm9yIChpPTA7IGk8Tjsg aSsrKQorICAgIHsKKyAgICAgIHB4W2ldID0gKGZsb2F0KSAoaSsyKTsKKyAg ICAgIHR4W2ldID0gKGZsb2F0KSAoaSsxKTsKKyAgICAgIHB5W2ldID0gKGZs b2F0KSAoaSs0KTsKKyAgICAgIHR5W2ldID0gKGZsb2F0KSAoaSszKTsKKyAg ICAgIHgxW2ldID0gejFbaV0gPSAxLjBmOworICAgIH0KKyAgZm9vMSAobik7 ICAvKiB2ZWN0b3IgdmFyaWFudC4gICovCisgIGZvciAoaT0wOyBpPE47aSsr KQorICAgIHsKKyAgICAgIHQxW2ldID0geDFbaV07IHgxW2ldID0gMS4wZjsK KyAgICAgIHQyW2ldID0gejFbaV07IHoxW2ldID0gMS4wZjsKKyAgICB9Cisg IGZvbzIgKG4pOyAgLyogc2NhbGFyIHZhcmlhbnQuICAqLworICBmb3IgKGk9 MDsgaTxOOyBpKyspCisgICAgaWYgKHgxW2ldICE9IHQxW2ldIHx8IHoxW2ld ICE9IHQyW2ldKQorICAgICAgYWJvcnQgKCk7CisgIHJldHVybiAwOworfQor LyogeyBkZy1maW5hbCB7IHNjYW4tdHJlZS1kdW1wICJPVVRFUiBMT09QIFZF Q1RPUklaRUQiICJ2ZWN0IiB9IH0gKi8KSW5kZXg6IHRyZWUtdmVjdC1sb29w LW1hbmlwLmMKPT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09 PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PQotLS0gdHJlZS12ZWN0 LWxvb3AtbWFuaXAuYwkocmV2aXNpb24gMjI0MTAwKQorKysgdHJlZS12ZWN0 LWxvb3AtbWFuaXAuYwkod29ya2luZyBjb3B5KQpAQCAtOTcsMTAgKzk3LDEy IEBACiB9CiAKIAotLyogUmVuYW1lcyB0aGUgdmFyaWFibGVzIGluIGJhc2lj IGJsb2NrIEJCLiAgKi8KKy8qIFJlbmFtZXMgdGhlIHZhcmlhYmxlcyBpbiBi YXNpYyBibG9jayBCQi4gIEFsbG93IHJlbmFtaW5nICBvZiBQSEkgYXJndW1u ZXRzCisgICBvbiBlZGdlcyBpbmNvbWluZyBmcm9tIG91dGVyLWJsb2NrIGhl YWRlciBpZiBSRU5BTUVfRlJPTV9PVVRFUl9MT09QIGlzCisgICB0cnVlLiAg 
Ki8KIAogc3RhdGljIHZvaWQKLXJlbmFtZV92YXJpYWJsZXNfaW5fYmIgKGJh c2ljX2Jsb2NrIGJiKQorcmVuYW1lX3ZhcmlhYmxlc19pbl9iYiAoYmFzaWNf YmxvY2sgYmIsIGJvb2wgcmVuYW1lX2Zyb21fb3V0ZXJfbG9vcCkKIHsKICAg Z2ltcGxlIHN0bXQ7CiAgIHVzZV9vcGVyYW5kX3AgdXNlX3A7CkBAIC0xMDgs NyArMTEwLDE0IEBACiAgIGVkZ2UgZTsKICAgZWRnZV9pdGVyYXRvciBlaTsK ICAgc3RydWN0IGxvb3AgKmxvb3AgPSBiYi0+bG9vcF9mYXRoZXI7CisgIHN0 cnVjdCBsb29wICpvdXRlcl9sb29wID0gTlVMTDsKIAorICBpZiAocmVuYW1l X2Zyb21fb3V0ZXJfbG9vcCkKKyAgICB7CisgICAgICBnY2NfYXNzZXJ0IChs b29wKTsKKyAgICAgIG91dGVyX2xvb3AgPSBsb29wX291dGVyIChsb29wKTsK KyAgICB9CisKICAgZm9yIChnaW1wbGVfc3RtdF9pdGVyYXRvciBnc2kgPSBn c2lfc3RhcnRfYmIgKGJiKTsgIWdzaV9lbmRfcCAoZ3NpKTsKICAgICAgICBn c2lfbmV4dCAoJmdzaSkpCiAgICAgewpAQCAtMTE5LDcgKzEyOCw4IEBACiAK ICAgRk9SX0VBQ0hfRURHRSAoZSwgZWksIGJiLT5wcmVkcykKICAgICB7Ci0g ICAgICBpZiAoIWZsb3dfYmJfaW5zaWRlX2xvb3BfcCAobG9vcCwgZS0+c3Jj KSkKKyAgICAgIGlmICghZmxvd19iYl9pbnNpZGVfbG9vcF9wIChsb29wLCBl LT5zcmMpCisJICAmJiAoIXJlbmFtZV9mcm9tX291dGVyX2xvb3AgfHwgZS0+ c3JjICE9IG91dGVyX2xvb3AtPmhlYWRlcikpCiAJY29udGludWU7CiAgICAg ICBmb3IgKGdwaGlfaXRlcmF0b3IgZ3NpID0gZ3NpX3N0YXJ0X3BoaXMgKGJi KTsgIWdzaV9lbmRfcCAoZ3NpKTsKIAkgICBnc2lfbmV4dCAoJmdzaSkpCkBA IC03NzUsNiArNzg1LDcgQEAKICAgYm9vbCB3YXNfaW1tX2RvbTsKICAgYmFz aWNfYmxvY2sgZXhpdF9kZXN0OwogICBlZGdlIGV4aXQsIG5ld19leGl0Owor ICBib29sIGR1cGxpY2F0ZV9vdXRlcl9sb29wID0gZmFsc2U7CiAKICAgZXhp dCA9IHNpbmdsZV9leGl0IChsb29wKTsKICAgYXRfZXhpdCA9IChlID09IGV4 aXQpOwpAQCAtNzg2LDcgKzc5NywxMCBAQAogCiAgIGJicyA9IFhORVdWRUMg KGJhc2ljX2Jsb2NrLCBzY2FsYXJfbG9vcC0+bnVtX25vZGVzICsgMSk7CiAg IGdldF9sb29wX2JvZHlfd2l0aF9zaXplIChzY2FsYXJfbG9vcCwgYmJzLCBz Y2FsYXJfbG9vcC0+bnVtX25vZGVzKTsKLQorICAvKiBBbGxvdyBkdXBsaWNh dGlvbiBvZiBvdXRlciBsb29wcyBpZiB0aGV5IGFyZSBtYXJrZWQgd2l0aCBw cmFnbWEKKyAgICAgb21wIHNpbWQuICAqLworICBpZiAoc2NhbGFyX2xvb3At PmZvcmNlX3ZlY3Rvcml6ZSAmJiBzY2FsYXJfbG9vcC0+aW5uZXIpCisgICAg ZHVwbGljYXRlX291dGVyX2xvb3AgPSB0cnVlOwogICAvKiBDaGVjayB3aGV0 aGVyIGR1cGxpY2F0aW9uIGlzIHBvc3NpYmxlLiAgKi8KICAgaWYgKCFjYW5f 
Y29weV9iYnNfcCAoYmJzLCBzY2FsYXJfbG9vcC0+bnVtX25vZGVzKSkKICAg ICB7CkBAIC04NTUsNyArODY5LDcgQEAKICAgICAgIHJlZGlyZWN0X2VkZ2Vf YW5kX2JyYW5jaF9mb3JjZSAoZSwgbmV3X3ByZWhlYWRlcik7CiAgICAgICBm bHVzaF9wZW5kaW5nX3N0bXRzIChlKTsKICAgICAgIHNldF9pbW1lZGlhdGVf ZG9taW5hdG9yIChDRElfRE9NSU5BVE9SUywgbmV3X3ByZWhlYWRlciwgZS0+ c3JjKTsKLSAgICAgIGlmICh3YXNfaW1tX2RvbSkKKyAgICAgIGlmICh3YXNf aW1tX2RvbSB8fCBkdXBsaWNhdGVfb3V0ZXJfbG9vcCkKIAlzZXRfaW1tZWRp YXRlX2RvbWluYXRvciAoQ0RJX0RPTUlOQVRPUlMsIGV4aXRfZGVzdCwgbmV3 X2V4aXQtPnNyYyk7CiAKICAgICAgIC8qIEFuZCByZW1vdmUgdGhlIG5vbi1u ZWNlc3NhcnkgZm9yd2FyZGVyIGFnYWluLiAgS2VlcCB0aGUgb3RoZXIKQEAg LTg5OCw3ICs5MTIsNyBAQAogICAgIH0KIAogICBmb3IgKHVuc2lnbmVkIGkg PSAwOyBpIDwgc2NhbGFyX2xvb3AtPm51bV9ub2RlcyArIDE7IGkrKykKLSAg ICByZW5hbWVfdmFyaWFibGVzX2luX2JiIChuZXdfYmJzW2ldKTsKKyAgICBy ZW5hbWVfdmFyaWFibGVzX2luX2JiIChuZXdfYmJzW2ldLCBkdXBsaWNhdGVf b3V0ZXJfbG9vcCk7CiAKICAgaWYgKHNjYWxhcl9sb29wICE9IGxvb3ApCiAg ICAgewpAQCAtOTg1LDcgKzk5OSwxMCBAQAogICAgKDMpIGl0IGlzIHNpbmds ZSBlbnRyeSwgc2luZ2xlIGV4aXQKICAgICg0KSBpdHMgZXhpdCBjb25kaXRp b24gaXMgdGhlIGxhc3Qgc3RtdCBpbiB0aGUgaGVhZGVyCiAgICAoNSkgRSBp cyB0aGUgZW50cnkvZXhpdCBlZGdlIG9mIExPT1AuCi0gKi8KKyAgIEFsbG93 IGR1cGxpY2F0aW9uIG9mIG91dGVyIGxvb3BzIGlmOgorICAgKDEnKSBpdCBp cyBtYXJrZWQgd2l0aCBmb3JjZV92ZWN0b3JpemUgZmxhZy4KKyAgICgyJykg aXQgY29uc2lzdHMgb2YgZXhhY3RseSA1IGJhc2ljIGJsb2Nrcy4KKyAgIE90 aGVyIGNvbmRpdGlvbnMgYXJlIHRha2VuIGFib3ZlLiAgKi8KIAogYm9vbAog c2xwZWVsX2Nhbl9kdXBsaWNhdGVfbG9vcF9wIChjb25zdCBzdHJ1Y3QgbG9v cCAqbG9vcCwgY29uc3RfZWRnZSBlKQpAQCAtOTk1LDYgKzEwMTIsMTEgQEAK ICAgZ2NvbmQgKm9yaWdfY29uZCA9IGdldF9sb29wX2V4aXRfY29uZGl0aW9u IChsb29wKTsKICAgZ2ltcGxlX3N0bXRfaXRlcmF0b3IgbG9vcF9leGl0X2dz aSA9IGdzaV9sYXN0X2JiIChleGl0X2UtPnNyYyk7CiAKKyAgaWYgKGxvb3At PmlubmVyICYmIGxvb3AtPmZvcmNlX3ZlY3Rvcml6ZSAmJiBsb29wLT5udW1f bm9kZXMgPT0gNQorICAgICAgJiYgc2luZ2xlX2V4aXQgKGxvb3ApICYmIChl ID09IGV4aXRfZSB8fCBlID09IGVudHJ5X2UpCisgICAgICAmJiBvcmlnX2Nv bmQgJiYgb3JpZ19jb25kID09IGdzaV9zdG10IChsb29wX2V4aXRfZ3NpKSkK 
KyAgICByZXR1cm4gdHJ1ZTsKKwogICBpZiAobG9vcC0+aW5uZXIKICAgICAg IC8qIEFsbCBsb29wcyBoYXZlIGFuIG91dGVyIHNjb3BlOyB0aGUgb25seSBj YXNlIGxvb3AtPm91dGVyIGlzIE5VTEwgaXMgZm9yCiAgICAgICAgICB0aGUg ZnVuY3Rpb24gaXRzZWxmLiAgKi8KSW5kZXg6IHRyZWUtdmVjdC1sb29wLmMK PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09 PT09PT09PT09PT09PT09PT09PT09PQotLS0gdHJlZS12ZWN0LWxvb3AuYwko cmV2aXNpb24gMjI0MTAwKQorKysgdHJlZS12ZWN0LWxvb3AuYwkod29ya2lu ZyBjb3B5KQpAQCAtMTg3OSw2ICsxODc5LDEwIEBACiAgICAgICByZXR1cm4g ZmFsc2U7CiAgICAgfQogCisgIC8qIFBlZWxpbmcgZm9yIGFsaWdubWVudCBp cyBub3Qgc3VwcG9ydGVkIGZvciBvdXRlci1sb29wIHZlY3Rvcml6YXRpb24u ICAqLworICBpZiAoTE9PUF9WSU5GT19MT09QIChsb29wX3ZpbmZvKS0+aW5u ZXIpCisgICAgTE9PUF9WSU5GT19QRUVMSU5HX0ZPUl9BTElHTk1FTlQgKGxv b3BfdmluZm8pID0gMDsKKwogICAvKiBEZWNpZGUgd2hldGhlciB3ZSBuZWVk IHRvIGNyZWF0ZSBhbiBlcGlsb2d1ZSBsb29wIHRvIGhhbmRsZQogICAgICBy ZW1haW5pbmcgc2NhbGFyIGl0ZXJhdGlvbnMuICAqLwogICB0aCA9ICgoTE9P UF9WSU5GT19DT1NUX01PREVMX1RIUkVTSE9MRCAobG9vcF92aW5mbykgKyAx KQo= --001a113dec463d7a580517ff18f8--