From mboxrd@z Thu Jan 1 00:00:00 1970
Mailing-List: contact gcc-patches-help@gcc.gnu.org; run by ezmlm
Sender: gcc-patches-owner@gcc.gnu.org
From: Bin Cheng
To: "gcc-patches@gcc.gnu.org"
CC: nd
Subject: [PATCH PR81740]Enforce dependence check for outer loop vectorization
Date: Fri, 15 Dec 2017 11:30:00 -0000
Content-Type: multipart/mixed; boundary="_002_DB5PR0801MB27422285F855F93CFAEA210BE70B0DB5PR0801MB2742_"
MIME-Version: 1.0
X-SW-Source: 2017-12/txt/msg01031.txt.bz2

--_002_DB5PR0801MB27422285F855F93CFAEA210BE70B0DB5PR0801MB2742_
Content-Type: text/plain; charset="iso-8859-1"

Hi,
As explained in the PR, given the test case below:

  int a[8][10] = { [2][5] = 4 }, c;

  int
  main ()
  {
    short b;
    int i, d;
    for (b = 4; b >= 0; b--)
      for (c = 0; c <= 6; c++)
        a[c + 1][b + 2] = a[c][b + 1];
    for (i = 0; i < 8; i++)
      for (d = 0; d < 10; d++)
        if (a[i][d] != (i == 3 && d == 6) * 4)
          __builtin_abort ();
    return 0;
  }

the loop nest is illegal for outer loop vectorization unless the inner loop
is reversed.
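
The legality claim can be checked outside the compiler. Below is a minimal
sketch, not GCC code: it assumes a vectorization factor of 4 and emulates
outer loop vectorization by running outer iterations b = 4..1 in lockstep for
each inner iteration (all loads before all stores), followed by a scalar
epilogue for b = 0. The names init, run_scalar, run_fused, same_result and
check_claims are hypothetical helpers introduced only for this illustration.

```c
#include <assert.h>
#include <string.h>

enum { ROWS = 8, COLS = 10 };

/* Reset the array to the initial state used by the test case.  */
void
init (int a[ROWS][COLS])
{
  memset (a, 0, sizeof (int[ROWS][COLS]));
  a[2][5] = 4;
}

/* Reference semantics: the scalar loop nest.  If REVERSE is nonzero,
   the inner loop runs from c = 6 down to c = 0.  */
void
run_scalar (int a[ROWS][COLS], int reverse)
{
  for (int b = 4; b >= 0; b--)
    for (int k = 0; k <= 6; k++)
      {
        int c = reverse ? 6 - k : k;
        a[c + 1][b + 2] = a[c][b + 1];
      }
}

/* Emulated outer loop vectorization (assumed factor of 4): lanes 0..3
   stand for outer iterations b = 4..1 and execute together for each
   inner iteration, loads first, then stores.  */
void
run_fused (int a[ROWS][COLS], int reverse)
{
  for (int k = 0; k <= 6; k++)
    {
      int c = reverse ? 6 - k : k;
      int tmp[4];
      for (int lane = 0; lane < 4; lane++)
        tmp[lane] = a[c][(4 - lane) + 1];
      for (int lane = 0; lane < 4; lane++)
        a[c + 1][(4 - lane) + 2] = tmp[lane];
    }
  /* Scalar epilogue for the remaining outer iteration b = 0.  */
  for (int k = 0; k <= 6; k++)
    {
      int c = reverse ? 6 - k : k;
      a[c + 1][2] = a[c][1];
    }
}

/* Return 1 if the emulated vectorized execution matches the scalar one.  */
int
same_result (int reverse)
{
  int x[ROWS][COLS], y[ROWS][COLS];
  init (x);
  init (y);
  run_scalar (x, reverse);
  run_fused (y, reverse);
  return memcmp (x, y, sizeof x) == 0;
}

/* The two claims: the forward inner loop breaks, the reversed one does not.  */
void
check_claims (void)
{
  assert (!same_result (0));
  assert (same_result (1));
}
```

With the forward inner loop, the lockstep execution clears a[2][5] at c = 1,
before the b = 4 lane reads it at c = 2, so a[3][6] never receives the value
4. With the inner loop reversed, the read at c = 2 happens first and both
executions agree.
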
The issue is in the vectorizer's data dependence checking; I believe the
mentioned revision just exposed it. Previously, vectorization was skipped
because of an unsupported memory operation.
Outer loop vectorization unrolls the outer loop into:

  for (b = 4; b > 0; b -= 4)
    {
      for (c = 0; c <= 6; c++)
        a[c + 1][6] = a[c][5];
      for (c = 0; c <= 6; c++)
        a[c + 1][5] = a[c][4];
      for (c = 0; c <= 6; c++)
        a[c + 1][4] = a[c][3];
      for (c = 0; c <= 6; c++)
        a[c + 1][3] = a[c][2];
    }

Then the four inner loops are fused into:

  for (b = 4; b > 0; b -= 4)
    {
      for (c = 0; c <= 6; c++)
        {
          a[c + 1][6] = a[c][5];  // S1
          a[c + 1][5] = a[c][4];  // S2
          a[c + 1][4] = a[c][3];
          a[c + 1][3] = a[c][2];
        }
    }

The loop fusion needs to meet the dependence requirement. Basically, GCC's
data dependence analyzer does not model dependences between references in
sibling loops, but in practice the fusion requirement can be checked by
analyzing all data references after fusion and verifying that there is no
backward data dependence.

Apparently, the requirement is violated because we have a backward data
dependence between the references (a[c][5], a[c+1][5]) in S1/S2. Note that if
we reverse the inner loop, the outer loop becomes legal for vectorization.

This patch fixes the issue by enforcing the dependence check. It also adds
two tests: one that shouldn't be vectorized and one that should. Bootstrapped
and tested on x86_64 and AArch64. Is it OK?

Thanks,
bin

2017-12-15  Bin Cheng

PR tree-optimization/81740
* tree-vect-data-refs.c (vect_analyze_data_ref_dependence): In case
of outer loop vectorization, check backward dependence at inner loop
if dependence at outer loop is reversed.

gcc/testsuite
2017-12-15  Bin Cheng

PR tree-optimization/81740
* gcc.dg/vect/pr81740-1.c: New test.
* gcc.dg/vect/pr81740-2.c: Refine test.= --_002_DB5PR0801MB27422285F855F93CFAEA210BE70B0DB5PR0801MB2742_ Content-Type: text/plain; name="pr81740-20171212.txt" Content-Description: pr81740-20171212.txt Content-Disposition: attachment; filename="pr81740-20171212.txt"; size=2869; creation-date="Fri, 15 Dec 2017 11:29:31 GMT"; modification-date="Fri, 15 Dec 2017 11:29:31 GMT" Content-Transfer-Encoding: base64 Content-length: 3892 RnJvbSBjMGM4Y2ZhZTA4YzBiZGUyY2VjNDFhOGQzYWJjYmZlYTBiZDJlMjEx IE1vbiBTZXAgMTcgMDA6MDA6MDAgMjAwMQpGcm9tOiBCaW4gQ2hlbmcgPGJp bmNoZTAxQGUxMDg0NTEtbGluLmNhbWJyaWRnZS5hcm0uY29tPgpEYXRlOiBU aHUsIDE0IERlYyAyMDE3IDE1OjMyOjAyICswMDAwClN1YmplY3Q6IFtQQVRD SF0gcHI4MTc0MC0yMDE3MTIxMi50eHQKCi0tLQogZ2NjL3Rlc3RzdWl0ZS9n Y2MuZGcvdmVjdC9wcjgxNzQwLTEuYyB8IDE3ICsrKysrKysrKysrKysrKysr CiBnY2MvdGVzdHN1aXRlL2djYy5kZy92ZWN0L3ByODE3NDAtMi5jIHwgMjEg KysrKysrKysrKysrKysrKysrKysrCiBnY2MvdHJlZS12ZWN0LWRhdGEtcmVm cy5jICAgICAgICAgICAgIHwgMTEgKysrKysrKysrKysKIDMgZmlsZXMgY2hh bmdlZCwgNDkgaW5zZXJ0aW9ucygrKQogY3JlYXRlIG1vZGUgMTAwNjQ0IGdj Yy90ZXN0c3VpdGUvZ2NjLmRnL3ZlY3QvcHI4MTc0MC0xLmMKIGNyZWF0ZSBt b2RlIDEwMDY0NCBnY2MvdGVzdHN1aXRlL2djYy5kZy92ZWN0L3ByODE3NDAt Mi5jCgpkaWZmIC0tZ2l0IGEvZ2NjL3Rlc3RzdWl0ZS9nY2MuZGcvdmVjdC9w cjgxNzQwLTEuYyBiL2djYy90ZXN0c3VpdGUvZ2NjLmRnL3ZlY3QvcHI4MTc0 MC0xLmMKbmV3IGZpbGUgbW9kZSAxMDA2NDQKaW5kZXggMDAwMDAwMC4uZDkw YWJhNQotLS0gL2Rldi9udWxsCisrKyBiL2djYy90ZXN0c3VpdGUvZ2NjLmRn L3ZlY3QvcHI4MTc0MC0xLmMKQEAgLTAsMCArMSwxNyBAQAorLyogeyBkZy1k byBydW4gfSAqLworaW50IGFbOF1bMTBdID0geyBbMl1bNV0gPSA0IH0sIGM7 CisKK2ludAorbWFpbiAoKQoreworICBzaG9ydCBiOworICBpbnQgaSwgZDsK KyAgZm9yIChiID0gNDsgYiA+PSAwOyBiLS0pCisgICAgZm9yIChjID0gMDsg YyA8PSA2OyBjKyspCisgICAgICBhW2MgKyAxXVtiICsgMl0gPSBhW2NdW2Ig KyAxXTsKKyAgZm9yIChpID0gMDsgaSA8IDg7IGkrKykKKyAgICBmb3IgKGQg PSAwOyBkIDwgMTA7IGQrKykKKyAgICAgIGlmIChhW2ldW2RdICE9IChpID09 IDMgJiYgZCA9PSA2KSAqIDQpCisgICAgICAgIF9fYnVpbHRpbl9hYm9ydCAo KTsKKyAgcmV0dXJuIDA7Cit9CmRpZmYgLS1naXQgYS9nY2MvdGVzdHN1aXRl 
L2djYy5kZy92ZWN0L3ByODE3NDAtMi5jIGIvZ2NjL3Rlc3RzdWl0ZS9nY2Mu ZGcvdmVjdC9wcjgxNzQwLTIuYwpuZXcgZmlsZSBtb2RlIDEwMDY0NAppbmRl eCAwMDAwMDAwLi5mYjViMzAwCi0tLSAvZGV2L251bGwKKysrIGIvZ2NjL3Rl c3RzdWl0ZS9nY2MuZGcvdmVjdC9wcjgxNzQwLTIuYwpAQCAtMCwwICsxLDIx IEBACisvKiB7IGRnLWRvIHJ1biB9ICovCisvKiB7IGRnLXJlcXVpcmUtZWZm ZWN0aXZlLXRhcmdldCB2ZWN0X2ludCB9ICovCisKK2ludCBhWzhdWzEwXSA9 IHsgWzJdWzVdID0gNCB9LCBjOworCitpbnQKK21haW4gKCkKK3sKKyAgc2hv cnQgYjsKKyAgaW50IGksIGQ7CisgIGZvciAoYiA9IDQ7IGIgPj0gMDsgYi0t KQorICAgIGZvciAoYyA9IDY7IGMgPj0gMDsgYy0tKQorICAgICAgYVtjICsg MV1bYiArIDJdID0gYVtjXVtiICsgMV07CisgIGZvciAoaSA9IDA7IGkgPCA4 OyBpKyspCisgICAgZm9yIChkID0gMDsgZCA8IDEwOyBkKyspCisgICAgICBp ZiAoYVtpXVtkXSAhPSAoaSA9PSAzICYmIGQgPT0gNikgKiA0KQorICAgICAg ICBfX2J1aWx0aW5fYWJvcnQgKCk7CisgIHJldHVybiAwOworfQorCisvKiB7 IGRnLWZpbmFsIHsgc2Nhbi10cmVlLWR1bXAtdGltZXMgIk9VVEVSIExPT1Ag VkVDVE9SSVpFRCIgMSAidmVjdCIgIH0gfSAqLwpkaWZmIC0tZ2l0IGEvZ2Nj L3RyZWUtdmVjdC1kYXRhLXJlZnMuYyBiL2djYy90cmVlLXZlY3QtZGF0YS1y ZWZzLmMKaW5kZXggOTk2ZDE1Ni4uM2I3ODBjZjEgMTAwNjQ0Ci0tLSBhL2dj Yy90cmVlLXZlY3QtZGF0YS1yZWZzLmMKKysrIGIvZ2NjL3RyZWUtdmVjdC1k YXRhLXJlZnMuYwpAQCAtNDM1LDYgKzQzNSwxNyBAQCB2ZWN0X2FuYWx5emVf ZGF0YV9yZWZfZGVwZW5kZW5jZSAoc3RydWN0IGRhdGFfZGVwZW5kZW5jZV9y ZWxhdGlvbiAqZGRyLAogCSAgaWYgKGR1bXBfZW5hYmxlZF9wICgpKQogCSAg ICBkdW1wX3ByaW50Zl9sb2MgKE1TR19NSVNTRURfT1BUSU1JWkFUSU9OLCB2 ZWN0X2xvY2F0aW9uLAogCSAgICAgICAgICAgICAgICAgICAgICJkZXBlbmRl bmNlIGRpc3RhbmNlIG5lZ2F0aXZlLlxuIik7CisJICAvKiBXaGVuIGRvaW5n IG91dGVyIGxvb3AgdmVjdG9yaXphdGlvbiwgd2UgbmVlZCB0byBjaGVjayBp ZiB0aGVyZSBpcworCSAgICAgYmFja3dhcmQgZGVwZW5kZW5jZSBhdCBpbm5l ciBsb29wIGxldmVsIGlmIGRlcGVuZGVuY2UgYXQgdGhlIG91dGVyCisJICAg ICBsb29wIGlzIHJldmVyc2VkLiAgU2VlIFBSODE3NDAgZm9yIG1vcmUgaW5m b3JtYXRpb24uICAqLworCSAgaWYgKG5lc3RlZF9pbl92ZWN0X2xvb3BfcCAo bG9vcCwgRFJfU1RNVCAoZHJhKSkKKwkgICAgICB8fCBuZXN0ZWRfaW5fdmVj dF9sb29wX3AgKGxvb3AsIERSX1NUTVQgKGRyYikpKQorCSAgICB7CisJICAg ICAgdW5zaWduZWQgaW5uZXJfZGVwdGggPSBpbmRleF9pbl9sb29wX25lc3Qg 
KGxvb3AtPmlubmVyLT5udW0sCisJCQkJCQkJIEREUl9MT09QX05FU1QgKGRk cikpOworCSAgICAgIGlmIChkaXN0X3ZbaW5uZXJfZGVwdGhdIDwgMCkKKwkJ cmV0dXJuIHRydWU7CisJICAgIH0KIAkgIC8qIFJlY29yZCBhIG5lZ2F0aXZl IGRlcGVuZGVuY2UgZGlzdGFuY2UgdG8gbGF0ZXIgbGltaXQgdGhlCiAJICAg ICBhbW91bnQgb2Ygc3RtdCBjb3B5aW5nIC8gdW5yb2xsaW5nIHdlIGNhbiBw ZXJmb3JtLgogCSAgICAgT25seSBuZWVkIHRvIGhhbmRsZSByZWFkLWFmdGVy LXdyaXRlIGRlcGVuZGVuY2UuICAqLwotLSAKMS45LjEKCg== --_002_DB5PR0801MB27422285F855F93CFAEA210BE70B0DB5PR0801MB2742_--