From mboxrd@z Thu Jan 1 00:00:00 1970
From: Uros Bizjak
Date: Fri, 21 May 2021 16:04:04 +0200
Subject: [RFC PATCH] i386: Enable auto-vectorization for 32bit modes (+ testcases)
To: "gcc-patches@gcc.gnu.org"
List-Id: Gcc-patches mailing list
MIME-Version: 1.0
Content-Type: multipart/mixed; boundary="00000000000001281905c2d78938"

--00000000000001281905c2d78938
Content-Type: text/plain; charset="UTF-8"

Here it is, the patch that enables auto-vectorization for 32bit modes.

Sent as RFC, because the patch fails some vectorizer scans, as it
obviously enables more vectorization to happen:

Running target unix
FAIL: gcc.dg/vect/pr71264.c -flto -ffat-lto-objects scan-tree-dump vect "vectorized 1 loops in function"
FAIL: gcc.dg/vect/pr71264.c scan-tree-dump vect "vectorized 1 loops in function"
FAIL: gcc.dg/vect/slp-28.c -flto -ffat-lto-objects scan-tree-dump-times vect "vectorized 1 loops" 1
FAIL: gcc.dg/vect/slp-28.c -flto -ffat-lto-objects scan-tree-dump-times vect "vectorizing stmts using SLP" 1
FAIL: gcc.dg/vect/slp-28.c scan-tree-dump-times vect "vectorized 1 loops" 1
FAIL: gcc.dg/vect/slp-28.c scan-tree-dump-times vect "vectorizing stmts using SLP" 1
FAIL: gcc.dg/vect/slp-3.c -flto -ffat-lto-objects scan-tree-dump-times vect "vectorized 3 loops" 1
FAIL: gcc.dg/vect/slp-3.c -flto -ffat-lto-objects scan-tree-dump-times vect "vectorizing stmts using SLP" 3
FAIL: gcc.dg/vect/slp-3.c scan-tree-dump-times vect "vectorized 3 loops" 1
FAIL: gcc.dg/vect/slp-3.c scan-tree-dump-times vect "vectorizing stmts using SLP" 3

Running target unix/-m32
FAIL: gcc.dg/vect/no-vfa-vect-101.c scan-tree-dump-times vect "can't determine dependence" 1
FAIL: gcc.dg/vect/no-vfa-vect-102.c scan-tree-dump-times vect "possible dependence between data-refs" 1
FAIL: gcc.dg/vect/no-vfa-vect-102a.c scan-tree-dump-times vect "possible dependence between data-refs" 1
FAIL: gcc.dg/vect/no-vfa-vect-37.c scan-tree-dump-times vect "can't determine dependence" 2
FAIL: gcc.dg/vect/pr71264.c -flto -ffat-lto-objects scan-tree-dump vect "vectorized 1 loops in function"
FAIL: gcc.dg/vect/pr71264.c scan-tree-dump vect "vectorized 1 loops in function"
FAIL: gcc.dg/vect/slp-28.c -flto -ffat-lto-objects scan-tree-dump-times vect "vectorized 1 loops" 1
FAIL: gcc.dg/vect/slp-28.c -flto -ffat-lto-objects scan-tree-dump-times vect "vectorizing stmts using SLP" 1
FAIL: gcc.dg/vect/slp-28.c scan-tree-dump-times vect "vectorized 1 loops" 1
FAIL: gcc.dg/vect/slp-28.c scan-tree-dump-times vect "vectorizing stmts using SLP" 1
FAIL: gcc.dg/vect/slp-3.c -flto -ffat-lto-objects scan-tree-dump-times vect "vectorized 3 loops" 1
FAIL: gcc.dg/vect/slp-3.c -flto -ffat-lto-objects scan-tree-dump-times vect "vectorizing stmts using SLP" 3
FAIL: gcc.dg/vect/slp-3.c scan-tree-dump-times vect "vectorized 3 loops" 1
FAIL: gcc.dg/vect/slp-3.c scan-tree-dump-times vect "vectorizing stmts using SLP" 3
FAIL: gcc.dg/vect/vect-104.c -flto -ffat-lto-objects scan-tree-dump-times vect "possible dependence between data-refs" 1
FAIL: gcc.dg/vect/vect-104.c scan-tree-dump-times vect "possible dependence between data-refs" 1

Please also note that V4QI and V2HI modes do not use MMX registers, so
auto-vectorization can also be enabled on 32bit x86 targets.

Uros.
--00000000000001281905c2d78938
Content-Type: text/plain; charset="US-ASCII"; name="p.diff.txt"
Content-Disposition: attachment; filename="p.diff.txt"
Content-Transfer-Encoding: 7bit
X-Attachment-Id: f_koye8qcc0

diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c
index f3b451835da..f43f3ba060e 100644
--- a/gcc/config/i386/i386.c
+++ b/gcc/config/i386/i386.c
@@ -22187,12 +22187,15 @@ ix86_autovectorize_vector_modes (vector_modes *modes, bool all)
       modes->safe_push (V16QImode);
       modes->safe_push (V32QImode);
     }
-  else if (TARGET_MMX_WITH_SSE)
+  else if (TARGET_SSE2)
     modes->safe_push (V16QImode);
 
   if (TARGET_MMX_WITH_SSE)
     modes->safe_push (V8QImode);
 
+  if (TARGET_SSE2)
+    modes->safe_push (V4QImode);
+
   return 0;
 }
 
diff --git a/gcc/testsuite/gcc.target/i386/pr100637-3b.c b/gcc/testsuite/gcc.target/i386/pr100637-3b.c
new file mode 100644
index 00000000000..16df70059a9
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/pr100637-3b.c
@@ -0,0 +1,56 @@
+/* PR target/100637 */
+/* { dg-do compile } */
+/* { dg-options "-O2 -ftree-vectorize -msse4" } */
+
+char r[4], a[4], b[4];
+unsigned char ur[4], ua[4], ub[4];
+
+void maxs (void)
+{
+  int i;
+
+  for (i = 0; i < 4; i++)
+    r[i] = a[i] > b[i] ? a[i] : b[i];
+}
+
+/* { dg-final { scan-assembler "pmaxsb" } } */
+
+void maxu (void)
+{
+  int i;
+
+  for (i = 0; i < 4; i++)
+    ur[i] = ua[i] > ub[i] ? ua[i] : ub[i];
+}
+
+/* { dg-final { scan-assembler "pmaxub" } } */
+
+void mins (void)
+{
+  int i;
+
+  for (i = 0; i < 4; i++)
+    r[i] = a[i] < b[i] ? a[i] : b[i];
+}
+
+/* { dg-final { scan-assembler "pminsb" } } */
+
+void minu (void)
+{
+  int i;
+
+  for (i = 0; i < 4; i++)
+    ur[i] = ua[i] < ub[i] ? ua[i] : ub[i];
+}
+
+/* { dg-final { scan-assembler "pminub" } } */
+
+void _abs (void)
+{
+  int i;
+
+  for (i = 0; i < 4; i++)
+    r[i] = a[i] < 0 ? -a[i] : a[i];
+}
+
+/* { dg-final { scan-assembler "pabsb" } } */
diff --git a/gcc/testsuite/gcc.target/i386/pr100637-3w.c b/gcc/testsuite/gcc.target/i386/pr100637-3w.c
new file mode 100644
index 00000000000..7f1882e7a56
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/pr100637-3w.c
@@ -0,0 +1,86 @@
+/* PR target/100637 */
+/* { dg-do compile } */
+/* { dg-options "-O2 -ftree-vectorize -msse4" } */
+
+short r[2], a[2], b[2];
+unsigned short ur[2], ua[2], ub[2];
+
+void mulh (void)
+{
+  int i;
+
+  for (i = 0; i < 2; i++)
+    r[i] = ((int) a[i] * b[i]) >> 16;
+}
+
+/* { dg-final { scan-assembler "pmulhw" { xfail *-*-* } } } */
+
+void mulhu (void)
+{
+  int i;
+
+  for (i = 0; i < 2; i++)
+    ur[i] = ((unsigned int) ua[i] * ub[i]) >> 16;
+}
+
+/* { dg-final { scan-assembler "pmulhuw" { xfail *-*-* } } } */
+
+void mulhrs (void)
+{
+  int i;
+
+  for (i = 0; i < 2; i++)
+    r[i] = ((((int) a[i] * b[i]) >> 14) + 1) >> 1;
+}
+
+/* { dg-final { scan-assembler "pmulhrsw" } } */
+
+void maxs (void)
+{
+  int i;
+
+  for (i = 0; i < 2; i++)
+    r[i] = a[i] > b[i] ? a[i] : b[i];
+}
+
+/* { dg-final { scan-assembler "pmaxsw" } } */
+
+void maxu (void)
+{
+  int i;
+
+  for (i = 0; i < 2; i++)
+    ur[i] = ua[i] > ub[i] ? ua[i] : ub[i];
+}
+
+/* { dg-final { scan-assembler "pmaxuw" } } */
+
+void mins (void)
+{
+  int i;
+
+  for (i = 0; i < 2; i++)
+    r[i] = a[i] < b[i] ? a[i] : b[i];
+}
+
+/* { dg-final { scan-assembler "pminsw" } } */
+
+void minu (void)
+{
+  int i;
+
+  for (i = 0; i < 2; i++)
+    ur[i] = ua[i] < ub[i] ? ua[i] : ub[i];
+}
+
+/* { dg-final { scan-assembler "pminuw" } } */
+
+void _abs (void)
+{
+  int i;
+
+  for (i = 0; i < 2; i++)
+    r[i] = a[i] < 0 ? -a[i] : a[i];
+}
+
+/* { dg-final { scan-assembler "pabsw" } } */
diff --git a/gcc/testsuite/gcc.target/i386/pr100637-4b.c b/gcc/testsuite/gcc.target/i386/pr100637-4b.c
new file mode 100644
index 00000000000..198e3dd3352
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/pr100637-4b.c
@@ -0,0 +1,19 @@
+/* PR target/100637 */
+/* { dg-do compile } */
+/* { dg-options "-O2 -ftree-vectorize -msse2" } */
+
+typedef char T;
+
+#define M 4
+
+extern T a[M], b[M], s1[M], s2[M], r[M];
+
+void foo (void)
+{
+  int j;
+
+  for (j = 0; j < M; j++)
+    r[j] = (a[j] < b[j]) ? s1[j] : s2[j];
+}
+
+/* { dg-final { scan-assembler "pcmpgtb" } } */
diff --git a/gcc/testsuite/gcc.target/i386/pr100637-4w.c b/gcc/testsuite/gcc.target/i386/pr100637-4w.c
new file mode 100644
index 00000000000..0f5dacce906
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/pr100637-4w.c
@@ -0,0 +1,19 @@
+/* PR target/100637 */
+/* { dg-do compile } */
+/* { dg-options "-O2 -ftree-vectorize -msse2" } */
+
+typedef short T;
+
+#define M 2
+
+extern T a[M], b[M], s1[M], s2[M], r[M];
+
+void foo (void)
+{
+  int j;
+
+  for (j = 0; j < M; j++)
+    r[j] = (a[j] < b[j]) ? s1[j] : s2[j];
+}
+
+/* { dg-final { scan-assembler "pcmpgtw" { xfail *-*-* } } } */
--00000000000001281905c2d78938--
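
For illustration only, here is a minimal sketch of the kind of loop the
patch is meant to vectorize: four 8-bit elements, which together fill one
32-bit (V4QI-sized) vector. It follows the loop shape of maxs () in the
pr100637-3b.c testcase, but spells the element type as signed char (an
assumption made here so the result does not depend on whether plain char
is signed). The code only pins down the scalar semantics; whether a given
compiler turns the loop into a single pmaxsb depends on the GCC version,
the target and the flags used.

```c
/* Four 8-bit elements occupy 32 bits, i.e. one V4QI-sized vector.  */
signed char r[4], a[4], b[4];

/* Element-wise signed maximum.  Same loop shape as maxs () in
   pr100637-3b.c; "signed char" instead of "char" is an assumption of
   this sketch, not something taken from the testcase.  */
void maxs (void)
{
  int i;

  for (i = 0; i < 4; i++)
    r[i] = a[i] > b[i] ? a[i] : b[i];
}
```

To look at the generated code, one can compile with the options from the
testcase (-O2 -ftree-vectorize -msse4) plus -S and search the assembly
for pmaxsb. A compiler without this change may well emit scalar code
instead.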