From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: from foss.arm.com (foss.arm.com [217.140.110.172])
	by sourceware.org (Postfix) with ESMTP id CC41B3858420
	for ; Fri, 17 Sep 2021 15:32:41 +0000 (GMT)
DMARC-Filter: OpenDMARC Filter v1.4.1 sourceware.org CC41B3858420
Received: from usa-sjc-imap-foss1.foss.arm.com (unknown [10.121.207.14])
	by usa-sjc-mx-foss1.foss.arm.com (Postfix) with ESMTP id 6F976101E;
	Fri, 17 Sep 2021 08:32:41 -0700 (PDT)
Received: from [10.57.71.131] (unknown [10.57.71.131])
	by usa-sjc-imap-foss1.foss.arm.com (Postfix) with ESMTPSA id DA2EC3F59C;
	Fri, 17 Sep 2021 08:32:40 -0700 (PDT)
Subject: [PATCH 2/3][vect] Consider outside costs earlier for epilogue loops
To: "gcc-patches@gcc.gnu.org"
Cc: Richard Sandiford, Richard Biener
References: <4a2e6dde-cc5c-97fe-7a43-bd59d542c2ce@arm.com>
From: "Andre Vieira (lists)"
Message-ID: <4b403865-bb56-29a4-56d0-b18536925db6@arm.com>
Date: Fri, 17 Sep 2021 16:32:48 +0100
User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:78.0)
	Gecko/20100101 Thunderbird/78.13.0
MIME-Version: 1.0
In-Reply-To: <4a2e6dde-cc5c-97fe-7a43-bd59d542c2ce@arm.com>
Content-Type: multipart/mixed;
	boundary="------------2E747061CD171D300CAA0AC1"
Content-Language: en-US
X-Spam-Status: No, score=-11.3 required=5.0 tests=BAYES_00, BODY_8BITS,
	GIT_PATCH_0, KAM_DMARC_STATUS, SPF_HELO_NONE, SPF_PASS, TXREP
	autolearn=ham autolearn_force=no version=3.4.4
X-Spam-Checker-Version: SpamAssassin 3.4.4 (2020-01-24) on
	server2.sourceware.org
X-BeenThere: gcc-patches@gcc.gnu.org
X-Mailman-Version: 2.1.29
Precedence: list
List-Id: Gcc-patches mailing list
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
X-List-Received-Date: Fri, 17 Sep 2021 15:32:43 -0000

This is a multi-part message in MIME format.
--------------2E747061CD171D300CAA0AC1
Content-Type: text/plain; charset=utf-8; format=flowed
Content-Transfer-Encoding: 8bit

Hi,

This patch changes the order in which we check outside and inside costs
for epilogue loops, so that a predicated epilogue is more likely to be
picked over an unpredicated one, since it saves having to enter a scalar
epilogue loop.

gcc/ChangeLog:

        * tree-vect-loop.c (vect_better_loop_vinfo_p): Change how epilogue
        loop costs are compared.

--------------2E747061CD171D300CAA0AC1
Content-Type: text/plain; charset=UTF-8; name="epilogue_costs.patch"
Content-Transfer-Encoding: base64
Content-Disposition: attachment; filename="epilogue_costs.patch"

ZGlmZiAtLWdpdCBhL2djYy90cmVlLXZlY3QtbG9vcC5jIGIvZ2NjL3RyZWUtdmVjdC1sb29w LmMKaW5kZXggMTRmODE1MGQ3YzI2MmI5NDIyNzg0ZTBlOTk3Y2E0Mzg3NjY0YTIwYS4uMDM4 YWYxM2E5MWQ0M2M5ZjA5MTg2ZDA0MmNmNDE1MDIwZWE3M2EzOCAxMDA2NDQKLS0tIGEvZ2Nj L3RyZWUtdmVjdC1sb29wLmMKKysrIGIvZ2NjL3RyZWUtdmVjdC1sb29wLmMKQEAgLTI4ODEs MTcgKzI4ODEsNzUgQEAgdmVjdF9iZXR0ZXJfbG9vcF92aW5mb19wIChsb29wX3ZlY19pbmZv IG5ld19sb29wX3ZpbmZvLAogCXJldHVybiBuZXdfc2ltZGxlbl9wOwogICAgIH0KIAorICBs b29wX3ZlY19pbmZvIG1haW5fbG9vcCA9IExPT1BfVklORk9fT1JJR19MT09QX0lORk8gKG9s ZF9sb29wX3ZpbmZvKTsKKyAgaWYgKG1haW5fbG9vcCkKKyAgICB7CisgICAgICBwb2x5X3Vp bnQ2NCBtYWluX3BvbHlfdmYgPSBMT09QX1ZJTkZPX1ZFQ1RfRkFDVE9SIChtYWluX2xvb3Ap OworICAgICAgdW5zaWduZWQgSE9TVF9XSURFX0lOVCBtYWluX3ZmOworICAgICAgdW5zaWdu ZWQgSE9TVF9XSURFX0lOVCBvbGRfZmFjdG9yLCBuZXdfZmFjdG9yLCBvbGRfY29zdCwgbmV3 X2Nvc3Q7CisgICAgICAvKiBJZiB3ZSBjYW4gZGV0ZXJtaW5lIGhvdyBtYW55IGl0ZXJhdGlv bnMgYXJlIGxlZnQgZm9yIHRoZSBlcGlsb2d1ZQorCSBsb29wLCB0aGF0IGlzIGlmIGJvdGgg dGhlIG1haW4gbG9vcCdzIHZlY3Rvcml6YXRpb24gZmFjdG9yIGFuZCBudW1iZXIKKwkgb2Yg aXRlcmF0aW9ucyBhcmUgY29uc3RhbnQsIHRoZW4gd2UgdXNlIHRoZW0gdG8gY2FsY3VsYXRl IHRoZSBjb3N0IG9mCisJIHRoZSBlcGlsb2d1ZSBsb29wIHRvZ2V0aGVyIHdpdGggYSAnbGlr ZWx5IHZhbHVlJyBmb3IgdGhlIGVwaWxvZ3VlcworCSB2ZWN0b3JpemF0aW9uIGZhY3Rvci4g
IE90aGVyd2lzZSB3ZSB1c2UgdGhlIG1haW4gbG9vcCdzIHZlY3Rvcml6YXRpb24KKwkgZmFj dG9yIGFuZCB0aGUgbWF4aW11bSBwb2x5IHZhbHVlIGZvciB0aGUgZXBpbG9ndWUncy4gIElm IHRoZSB0YXJnZXQKKwkgaGFzIG5vdCBwcm92aWRlZCB3aXRoIGEgc2Vuc2libGUgdXBwZXIg Ym91bmQgcG9seSB2ZWN0b3JpemF0aW9uCisJIGZhY3RvcnMgYXJlIGxpa2VseSB0byBiZSBm YXZvcmVkIG92ZXIgY29uc3RhbnQgb25lcy4gICovCisgICAgICBpZiAobWFpbl9wb2x5X3Zm LmlzX2NvbnN0YW50ICgmbWFpbl92ZikKKwkgICYmIExPT1BfVklORk9fTklURVJTX0tOT1dO X1AgKG1haW5fbG9vcCkpCisJeworCSAgdW5zaWduZWQgSE9TVF9XSURFX0lOVCBuaXRlcnMK KwkgICAgPSBMT09QX1ZJTkZPX0lOVF9OSVRFUlMgKG1haW5fbG9vcCkgJSBtYWluX3ZmOwor CSAgSE9TVF9XSURFX0lOVCBvbGRfbGlrZWx5X3ZmCisJICAgID0gZXN0aW1hdGVkX3BvbHlf dmFsdWUgKG9sZF92ZiwgUE9MWV9WQUxVRV9MSUtFTFkpOworCSAgSE9TVF9XSURFX0lOVCBu ZXdfbGlrZWx5X3ZmCisJICAgID0gZXN0aW1hdGVkX3BvbHlfdmFsdWUgKG5ld192ZiwgUE9M WV9WQUxVRV9MSUtFTFkpOworCisJICAvKiBJZiB0aGUgZXBpbG9ndWUgaXMgdXNpbmcgcGFy dGlhbCB2ZWN0b3JzIHdlIGFjY291bnQgZm9yIHRoZQorCSAgICAgcGFydGlhbCBpdGVyYXRp b24gaGVyZSB0b28uICAqLworCSAgb2xkX2ZhY3RvciA9IG5pdGVycyAvIG9sZF9saWtlbHlf dmY7CisJICBpZiAoTE9PUF9WSU5GT19VU0lOR19QQVJUSUFMX1ZFQ1RPUlNfUCAob2xkX2xv b3BfdmluZm8pCisJICAgICAgJiYgbml0ZXJzICUgb2xkX2xpa2VseV92ZiAhPSAwKQorCSAg ICBvbGRfZmFjdG9yKys7CisKKwkgIG5ld19mYWN0b3IgPSBuaXRlcnMgLyBuZXdfbGlrZWx5 X3ZmOworCSAgaWYgKExPT1BfVklORk9fVVNJTkdfUEFSVElBTF9WRUNUT1JTX1AgKG5ld19s b29wX3ZpbmZvKQorCSAgICAgICYmIG5pdGVycyAlIG5ld19saWtlbHlfdmYgIT0gMCkKKwkg ICAgbmV3X2ZhY3RvcisrOworCX0KKyAgICAgIGVsc2UKKwl7CisJICB1bnNpZ25lZCBIT1NU X1dJREVfSU5UIG1haW5fdmZfbWF4CisJICAgID0gZXN0aW1hdGVkX3BvbHlfdmFsdWUgKG1h aW5fcG9seV92ZiwgUE9MWV9WQUxVRV9NQVgpOworCisJICBvbGRfZmFjdG9yID0gbWFpbl92 Zl9tYXggLyBlc3RpbWF0ZWRfcG9seV92YWx1ZSAob2xkX3ZmLAorCQkJCQkJCSAgIFBPTFlf VkFMVUVfTUFYKTsKKwkgIG5ld19mYWN0b3IgPSBtYWluX3ZmX21heCAvIGVzdGltYXRlZF9w b2x5X3ZhbHVlIChuZXdfdmYsCisJCQkJCQkJICAgUE9MWV9WQUxVRV9NQVgpOworCisJICAv KiBJZiB0aGUgbG9vcCBpcyBub3QgdXNpbmcgcGFydGlhbCB2ZWN0b3JzIHRoZW4gaXQgd2ls bCBpdGVyYXRlIG9uZQorCSAgICAgdGltZSBsZXNzIHRoYW4gb25lIHRoYXQgZG9lcy4gIEl0 
IGlzIHNhZmUgdG8gc3VidHJhY3Qgb25lIGhlcmUsCisJICAgICBiZWNhdXNlIHRoZSBtYWlu IGxvb3AncyB2ZiBpcyBhbHdheXMgYXQgbGVhc3QgMnggYmlnZ2VyIHRoYW4gdGhhdAorCSAg ICAgb2YgYW4gZXBpbG9ndWUuICAqLworCSAgaWYgKCFMT09QX1ZJTkZPX1VTSU5HX1BBUlRJ QUxfVkVDVE9SU19QIChvbGRfbG9vcF92aW5mbykpCisJICAgIG9sZF9mYWN0b3IgLT0gMTsK KwkgIGlmICghTE9PUF9WSU5GT19VU0lOR19QQVJUSUFMX1ZFQ1RPUlNfUCAobmV3X2xvb3Bf dmluZm8pKQorCSAgICBuZXdfZmFjdG9yIC09IDE7CisJfQorCisgICAgICAvKiBDb21wdXRl IHRoZSBjb3N0cyBieSBtdWx0aXBseWluZyB0aGUgaW5zaWRlIGNvc3RzIHdpdGggdGhlIGZh Y3RvciBhbmQKKwkgYWRkIHRoZSBvdXRzaWRlIGNvc3RzIGZvciBhIG1vcmUgY29tcGxldGUg cGljdHVyZS4gIFRoZSBmYWN0b3IgaXMgdGhlCisJIGFtb3VudCBvZiB0aW1lcyB3ZSBhcmUg ZXhwZWN0aW5nIHRvIGl0ZXJhdGUgdGhpcyBlcGlsb2d1ZS4gICovCisgICAgICBvbGRfY29z dCA9IG9sZF9sb29wX3ZpbmZvLT52ZWNfaW5zaWRlX2Nvc3QgKiBvbGRfZmFjdG9yOworICAg ICAgbmV3X2Nvc3QgPSBuZXdfbG9vcF92aW5mby0+dmVjX2luc2lkZV9jb3N0ICogbmV3X2Zh Y3RvcjsKKyAgICAgIG9sZF9jb3N0ICs9IG9sZF9sb29wX3ZpbmZvLT52ZWNfb3V0c2lkZV9j b3N0OworICAgICAgbmV3X2Nvc3QgKz0gbmV3X2xvb3BfdmluZm8tPnZlY19vdXRzaWRlX2Nv c3Q7CisgICAgICByZXR1cm4gbmV3X2Nvc3QgPCBvbGRfY29zdDsKKyAgICB9CisKICAgLyog TGltaXQgdGhlIFZGcyB0byB3aGF0IGlzIGxpa2VseSB0byBiZSB0aGUgbWF4aW11bSBudW1i ZXIgb2YgaXRlcmF0aW9ucywKICAgICAgdG8gaGFuZGxlIGNhc2VzIGluIHdoaWNoIGF0IGxl YXN0IG9uZSBsb29wX3ZpbmZvIGlzIGZ1bGx5LW1hc2tlZC4gICovCi0gIEhPU1RfV0lERV9J TlQgZXN0aW1hdGVkX21heF9uaXRlcjsKLSAgbG9vcF92ZWNfaW5mbyBtYWluX2xvb3AgPSBM T09QX1ZJTkZPX09SSUdfTE9PUF9JTkZPIChvbGRfbG9vcF92aW5mbyk7Ci0gIHVuc2lnbmVk IEhPU1RfV0lERV9JTlQgbWFpbl92ZjsKLSAgaWYgKG1haW5fbG9vcAotICAgICAgJiYgTE9P UF9WSU5GT19OSVRFUlNfS05PV05fUCAobWFpbl9sb29wKQotICAgICAgJiYgTE9PUF9WSU5G T19WRUNUX0ZBQ1RPUiAobWFpbl9sb29wKS5pc19jb25zdGFudCAoJm1haW5fdmYpKQotICAg IGVzdGltYXRlZF9tYXhfbml0ZXIgPSBMT09QX1ZJTkZPX0lOVF9OSVRFUlMgKG1haW5fbG9v cCkgJSBtYWluX3ZmOwotICBlbHNlCi0gICAgZXN0aW1hdGVkX21heF9uaXRlciA9IGxpa2Vs eV9tYXhfc3RtdF9leGVjdXRpb25zX2ludCAobG9vcCk7CisgIEhPU1RfV0lERV9JTlQgZXN0 aW1hdGVkX21heF9uaXRlciA9IGxpa2VseV9tYXhfc3RtdF9leGVjdXRpb25zX2ludCAobG9v 
cCk7CiAgIGlmIChlc3RpbWF0ZWRfbWF4X25pdGVyICE9IC0xKQogICAgIHsKICAgICAgIGlm IChrbm93bl9sZSAoZXN0aW1hdGVkX21heF9uaXRlciwgbmV3X3ZmKSkK --------------2E747061CD171D300CAA0AC1--
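
For readers skimming the attachment, the comparison it adds can be sketched
roughly as below. This is only an illustration of the case where the main
loop's vectorization factor and iteration count are both constant; it is not
the code from the patch. The struct and function names (EpilogueCandidate,
expected_iterations, total_cost, better_epilogue_p) are hypothetical stand-ins
for the loop_vec_info fields the patch reads, and the vectorization factor is
assumed to be a plain constant, whereas the patch uses a 'likely' estimate of
a possibly variable-length one.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for the loop_vec_info fields used by the comparison.
struct EpilogueCandidate {
  std::uint64_t vf;            // vectorization factor, assumed constant here
  std::uint64_t inside_cost;   // cost of one iteration of the vector loop
  std::uint64_t outside_cost;  // one-off cost paid outside the loop body
  bool partial_vectors;        // true for a predicated (partial-vector) loop
};

// Number of times the epilogue is expected to iterate when `niters` scalar
// iterations are left over from the main loop.  A predicated epilogue also
// executes the final partial iteration; an unpredicated one leaves it to a
// scalar loop.
inline std::uint64_t expected_iterations(const EpilogueCandidate &c,
                                         std::uint64_t niters) {
  assert(c.vf != 0);
  std::uint64_t factor = niters / c.vf;
  if (c.partial_vectors && niters % c.vf != 0)
    ++factor;
  return factor;
}

// Inside cost scaled by the expected iteration count, plus the outside cost.
inline std::uint64_t total_cost(const EpilogueCandidate &c,
                                std::uint64_t niters) {
  return c.inside_cost * expected_iterations(c, niters) + c.outside_cost;
}

// True if `new_c` is the cheaper epilogue for `niters` leftover iterations.
inline bool better_epilogue_p(const EpilogueCandidate &new_c,
                              const EpilogueCandidate &old_c,
                              std::uint64_t niters) {
  return total_cost(new_c, niters) < total_cost(old_c, niters);
}
```

With made-up costs, a predicated candidate {vf 8, inside 10, outside 4} and an
unpredicated one {vf 4, inside 6, outside 12} both iterate once for 7 leftover
iterations, giving totals of 14 and 18, so the predicated epilogue is
preferred once outside costs are part of the comparison.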