From mboxrd@z Thu Jan 1 00:00:00 1970
From: Uros Bizjak
Date: Wed, 26 May 2021 10:43:52 +0200
Subject: Re: [RFC PATCH] i386: Enable auto-vectorization for 32bit modes (+ testcases)
To: Richard Biener
Cc: "gcc-patches@gcc.gnu.org"
MIME-Version: 1.0
Content-Type: multipart/mixed; boundary="0000000000001bfa0905c337a5e2"
X-BeenThere: gcc-patches@gcc.gnu.org
X-Mailman-Version: 2.1.29
Precedence: list
List-Id: Gcc-patches mailing list
X-List-Received-Date: Wed, 26 May 2021 08:44:06 -0000

--0000000000001bfa0905c337a5e2
Content-Type: text/plain; charset="UTF-8"

On Tue, May 25, 2021 at 4:29 PM Richard Biener wrote:
>
> On Fri, May 21, 2021 at 5:00 PM Uros Bizjak via Gcc-patches
> wrote:
> >
> > Here it is, the patch that enables auto-vectorization for 32bit modes.
> >
> > Sent as RFC, because the patch fails some vectorizer scans, as it
> > obviously enables more vectorization to happen:
> >
> > Running target unix
> > FAIL: gcc.dg/vect/pr71264.c -flto -ffat-lto-objects scan-tree-dump vect "vectorized 1 loops in function"
> > FAIL: gcc.dg/vect/pr71264.c scan-tree-dump vect "vectorized 1 loops in function"
> > FAIL: gcc.dg/vect/slp-28.c -flto -ffat-lto-objects scan-tree-dump-times vect "vectorized 1 loops" 1
> > FAIL: gcc.dg/vect/slp-28.c -flto -ffat-lto-objects scan-tree-dump-times vect "vectorizing stmts using SLP" 1
> > FAIL: gcc.dg/vect/slp-28.c scan-tree-dump-times vect "vectorized 1 loops" 1
> > FAIL: gcc.dg/vect/slp-28.c scan-tree-dump-times vect "vectorizing stmts using SLP" 1
> > FAIL: gcc.dg/vect/slp-3.c -flto -ffat-lto-objects scan-tree-dump-times vect "vectorized 3 loops" 1
> > FAIL: gcc.dg/vect/slp-3.c -flto -ffat-lto-objects scan-tree-dump-times vect "vectorizing stmts using SLP" 3
> > FAIL: gcc.dg/vect/slp-3.c scan-tree-dump-times vect "vectorized 3 loops" 1
> > FAIL: gcc.dg/vect/slp-3.c scan-tree-dump-times vect "vectorizing stmts using SLP" 3
> >
> >
> > Running target unix/-m32
> > FAIL: gcc.dg/vect/no-vfa-vect-101.c scan-tree-dump-times vect "can't determine dependence" 1
> > FAIL: gcc.dg/vect/no-vfa-vect-102.c scan-tree-dump-times vect "possible dependence between data-refs" 1
> > FAIL: gcc.dg/vect/no-vfa-vect-102a.c scan-tree-dump-times vect "possible dependence between data-refs" 1
> > FAIL: gcc.dg/vect/no-vfa-vect-37.c scan-tree-dump-times vect "can't determine dependence" 2
> > FAIL: gcc.dg/vect/pr71264.c -flto -ffat-lto-objects scan-tree-dump vect "vectorized 1 loops in function"
> > FAIL: gcc.dg/vect/pr71264.c scan-tree-dump vect "vectorized 1 loops in function"
> > FAIL: gcc.dg/vect/slp-28.c -flto -ffat-lto-objects scan-tree-dump-times vect "vectorized 1 loops" 1
> > FAIL: gcc.dg/vect/slp-28.c -flto -ffat-lto-objects scan-tree-dump-times vect "vectorizing stmts using SLP" 1
> > FAIL: gcc.dg/vect/slp-28.c scan-tree-dump-times vect "vectorized 1 loops" 1
> > FAIL: gcc.dg/vect/slp-28.c scan-tree-dump-times vect "vectorizing stmts using SLP" 1
> > FAIL: gcc.dg/vect/slp-3.c -flto -ffat-lto-objects scan-tree-dump-times vect "vectorized 3 loops" 1
> > FAIL: gcc.dg/vect/slp-3.c -flto -ffat-lto-objects scan-tree-dump-times vect "vectorizing stmts using SLP" 3
> > FAIL: gcc.dg/vect/slp-3.c scan-tree-dump-times vect "vectorized 3 loops" 1
> > FAIL: gcc.dg/vect/slp-3.c scan-tree-dump-times vect "vectorizing stmts using SLP" 3
> > FAIL: gcc.dg/vect/vect-104.c -flto -ffat-lto-objects scan-tree-dump-times vect "possible dependence between data-refs" 1
> > FAIL: gcc.dg/vect/vect-104.c scan-tree-dump-times vect "possible dependence between data-refs" 1
>
> Yeah, it's a bit iffy to adjust expectations. If there's a way to
> disable vectorization for 32bit modes on x86 that might be a way to
> "fix" them, otherwise we're lacking a way to query for available vector
> modes/sizes in the dejagnu vect targets. There's available_vector_sizes
> but its implementation is hardly complete nor is size the only
> important thing (FP vs. INT). At least one could add a vect32 predicate
> similar to the existing vect64 one.

I went the way you proposed above. With the 32bit vector size added to
available_vector_sizes, only two testcases fail.
The attached patch fixes all vect scan failures (the remaining failure
in vect_epilogues.c is just a case of the missing uavg3_ceil pattern for
V4QI epilogue vectorization; I plan to add the insn in a follow-up
patch). The patch also xfails pr71264.c, a case of missing
re-vectorization of 32bit vectors.

WDYT?

Uros.

--0000000000001bfa0905c337a5e2
Content-Type: text/plain; charset="US-ASCII"; name="p.diff.txt"
Content-Disposition: attachment; filename="p.diff.txt"
Content-Transfer-Encoding: base64
X-Attachment-Id: f_kp580gfq0

ZGlmZiAtLWdpdCBhL2djYy9jb25maWcvaTM4Ni9pMzg2LmMgYi9nY2MvY29uZmlnL2kzODYvaTM4 Ni5jCmluZGV4IDI4ZTYxMTNhNjA5Li4wNDY0OWI0MjEyMiAxMDA2NDQKLS0tIGEvZ2NjL2NvbmZp Zy9pMzg2L2kzODYuYworKysgYi9nY2MvY29uZmlnL2kzODYvaTM4Ni5jCkBAIC0yMjE5MCwxMiAr MjIxOTAsMTUgQEAgaXg4Nl9hdXRvdmVjdG9yaXplX3ZlY3Rvcl9tb2RlcyAodmVjdG9yX21vZGVz ICptb2RlcywgYm9vbCBhbGwpCiAgICAgICBtb2Rlcy0+c2FmZV9wdXNoIChWMTZRSW1vZGUpOwog ICAgICAgbW9kZXMtPnNhZmVfcHVzaCAoVjMyUUltb2RlKTsKICAgICB9Ci0gIGVsc2UgaWYgKFRB UkdFVF9NTVhfV0lUSF9TU0UpCisgIGVsc2UgaWYgKFRBUkdFVF9TU0UyKQogICAgIG1vZGVzLT5z YWZlX3B1c2ggKFYxNlFJbW9kZSk7CiAKICAgaWYgKFRBUkdFVF9NTVhfV0lUSF9TU0UpCiAgICAg bW9kZXMtPnNhZmVfcHVzaCAoVjhRSW1vZGUpOwogCisgIGlmIChUQVJHRVRfU1NFMikKKyAgICBt b2Rlcy0+c2FmZV9wdXNoIChWNFFJbW9kZSk7CisKICAgcmV0dXJuIDA7CiB9CiAKZGlmZiAtLWdp dCBhL2djYy9kb2Mvc291cmNlYnVpbGQudGV4aSBiL2djYy9kb2Mvc291cmNlYnVpbGQudGV4aQpp bmRleCBjZjMwOTg3NDljMC4uMTZjNmEzYjhlOTkgMTAwNjQ0Ci0tLSBhL2djYy9kb2Mvc291cmNl YnVpbGQudGV4aQorKysgYi9nY2MvZG9jL3NvdXJjZWJ1aWxkLnRleGkKQEAgLTE3NDAsNiArMTc0 MCwxMiBAQCBjaXJjdW1zdGFuY2VzLgogQGl0ZW0gdmVjdF92YXJpYWJsZV9sZW5ndGgKIFRhcmdl dCBoYXMgdmFyaWFibGUtbGVuZ3RoIHZlY3RvcnMuCiAKK0BpdGVtIHZlY3Q2NAorVGFyZ2V0IHN1 cHBvcnRzIHZlY3RvcnMgb2YgNjQgYml0cy4KKworQGl0ZW0gdmVjdDMyCitUYXJnZXQgc3VwcG9y dHMgdmVjdG9ycyBvZiAzMiBiaXRzLgorCiBAaXRlbSB2ZWN0X3dpZGVuX3N1bV9oaV90b19zaQog VGFyZ2V0IHN1cHBvcnRzIGEgdmVjdG9yIHdpZGVuaW5nIHN1bW1hdGlvbiBvZiBAY29kZXtzaG9y
dH0gb3BlcmFuZHMKIGludG8gQGNvZGV7aW50fSByZXN1bHRzLCBvciBjYW4gcHJvbW90ZSAodW5w YWNrKSBmcm9tIEBjb2Rle3Nob3J0fQpkaWZmIC0tZ2l0IGEvZ2NjL3Rlc3RzdWl0ZS9nY2MuZGcv dmVjdC9wcjcxMjY0LmMgYi9nY2MvdGVzdHN1aXRlL2djYy5kZy92ZWN0L3ByNzEyNjQuYwppbmRl eCBkYzg0OWJmMjc5Ny4uMTM4MWUwZWQxMzIgMTAwNjQ0Ci0tLSBhL2djYy90ZXN0c3VpdGUvZ2Nj LmRnL3ZlY3QvcHI3MTI2NC5jCisrKyBiL2djYy90ZXN0c3VpdGUvZ2NjLmRnL3ZlY3QvcHI3MTI2 NC5jCkBAIC0xOSw1ICsxOSw0IEBAIHZvaWQgdGVzdCh1aW50OF90ICpwdHIsIHVpbnQ4X3QgKm1h c2spCiAgICAgfQogfQogCi0vKiB7IGRnLWZpbmFsIHsgc2Nhbi10cmVlLWR1bXAgInZlY3Rvcml6 ZWQgMSBsb29wcyBpbiBmdW5jdGlvbiIgInZlY3QiIHsgeGZhaWwgczM5MCotKi0qIHNwYXJjKi0q LSogfSB9IH0gKi8KLQorLyogeyBkZy1maW5hbCB7IHNjYW4tdHJlZS1kdW1wICJ2ZWN0b3JpemVk IDEgbG9vcHMgaW4gZnVuY3Rpb24iICJ2ZWN0IiB7IHhmYWlsIHsgeyBzMzkwKi0qLSogc3BhcmMq LSotKiB9IHx8IHZlY3QzMiB9IH0gfSB9ICovCmRpZmYgLS1naXQgYS9nY2MvdGVzdHN1aXRlL2dj Yy5kZy92ZWN0L3NscC0yOC5jIGIvZ2NjL3Rlc3RzdWl0ZS9nY2MuZGcvdmVjdC9zbHAtMjguYwpp bmRleCA3Nzc4YmFkNDQ2NS4uMGJiNWYwZWIwZTQgMTAwNjQ0Ci0tLSBhL2djYy90ZXN0c3VpdGUv Z2NjLmRnL3ZlY3Qvc2xwLTI4LmMKKysrIGIvZ2NjL3Rlc3RzdWl0ZS9nY2MuZGcvdmVjdC9zbHAt MjguYwpAQCAtODgsNiArODgsNyBAQCBpbnQgbWFpbiAodm9pZCkKICAgcmV0dXJuIDA7CiB9CiAK LS8qIHsgZGctZmluYWwgeyBzY2FuLXRyZWUtZHVtcC10aW1lcyAidmVjdG9yaXplZCAxIGxvb3Bz IiAxICJ2ZWN0IiAgfSB9ICovCi0vKiB7IGRnLWZpbmFsIHsgc2Nhbi10cmVlLWR1bXAtdGltZXMg InZlY3Rvcml6aW5nIHN0bXRzIHVzaW5nIFNMUCIgMSAidmVjdCIgfSB9ICovCisvKiB7IGRnLWZp bmFsIHsgc2Nhbi10cmVlLWR1bXAtdGltZXMgInZlY3Rvcml6ZWQgMSBsb29wcyIgMSAidmVjdCIg eyB0YXJnZXQgeyAhIHZlY3QzMiB9IH0gfSB9ICovCisvKiB7IGRnLWZpbmFsIHsgc2Nhbi10cmVl LWR1bXAtdGltZXMgInZlY3Rvcml6ZWQgMiBsb29wcyIgMSAidmVjdCIgeyB0YXJnZXQgdmVjdDMy IH0gfSB9ICovCisvKiB7IGRnLWZpbmFsIHsgc2Nhbi10cmVlLWR1bXAtdGltZXMgInZlY3Rvcml6 aW5nIHN0bXRzIHVzaW5nIFNMUCIgMSAidmVjdCIgeyB0YXJnZXQgeyAhIHZlY3QzMiB9IH0gfSB9 ICovCiAgIApkaWZmIC0tZ2l0IGEvZ2NjL3Rlc3RzdWl0ZS9nY2MuZGcvdmVjdC9zbHAtMy5jIGIv Z2NjL3Rlc3RzdWl0ZS9nY2MuZGcvdmVjdC9zbHAtMy5jCmluZGV4IDQ2YWI1ODQ0MTlhLi44MGRl 
ZDE4NDBhZCAxMDA2NDQKLS0tIGEvZ2NjL3Rlc3RzdWl0ZS9nY2MuZGcvdmVjdC9zbHAtMy5jCisr KyBiL2djYy90ZXN0c3VpdGUvZ2NjLmRnL3ZlY3Qvc2xwLTMuYwpAQCAtMTQxLDggKzE0MSw4IEBA IGludCBtYWluICh2b2lkKQogICByZXR1cm4gMDsKIH0KIAotLyogeyBkZy1maW5hbCB7IHNjYW4t dHJlZS1kdW1wLXRpbWVzICJ2ZWN0b3JpemVkIDMgbG9vcHMiIDEgInZlY3QiIHsgdGFyZ2V0IHsg ISB2ZWN0X3BhcnRpYWxfdmVjdG9ycyB9IH0gfSB9ICovCi0vKiB7IGRnLWZpbmFsIHsgc2Nhbi10 cmVlLWR1bXAtdGltZXMgInZlY3Rvcml6ZWQgNCBsb29wcyIgMSAidmVjdCIgeyB0YXJnZXQgdmVj dF9wYXJ0aWFsX3ZlY3RvcnMgfSB9IH0gKi8KLS8qIHsgZGctZmluYWwgeyBzY2FuLXRyZWUtZHVt cC10aW1lcyAidmVjdG9yaXppbmcgc3RtdHMgdXNpbmcgU0xQIiAzICJ2ZWN0IiB7IHRhcmdldCB7 ICEgdmVjdF9wYXJ0aWFsX3ZlY3RvcnMgfSB9IH0gfSovCi0vKiB7IGRnLWZpbmFsIHsgc2Nhbi10 cmVlLWR1bXAtdGltZXMgInZlY3Rvcml6aW5nIHN0bXRzIHVzaW5nIFNMUCIgNCAidmVjdCIgeyB0 YXJnZXQgdmVjdF9wYXJ0aWFsX3ZlY3RvcnMgfSB9IH0gKi8KKy8qIHsgZGctZmluYWwgeyBzY2Fu LXRyZWUtZHVtcC10aW1lcyAidmVjdG9yaXplZCAzIGxvb3BzIiAxICJ2ZWN0IiB7IHRhcmdldCB7 ICEgeyB2ZWN0X3BhcnRpYWxfdmVjdG9ycyB8fCB2ZWN0MzIgfSB9IH0gfSB9ICovCisvKiB7IGRn LWZpbmFsIHsgc2Nhbi10cmVlLWR1bXAtdGltZXMgInZlY3Rvcml6ZWQgNCBsb29wcyIgMSAidmVj dCIgeyB0YXJnZXQgeyB2ZWN0X3BhcnRpYWxfdmVjdG9ycyB8fCB2ZWN0MzIgfSB9IH0gfSAqLwor LyogeyBkZy1maW5hbCB7IHNjYW4tdHJlZS1kdW1wLXRpbWVzICJ2ZWN0b3JpemluZyBzdG10cyB1 c2luZyBTTFAiIDMgInZlY3QiIHsgdGFyZ2V0IHsgISB7IHZlY3RfcGFydGlhbF92ZWN0b3JzIHx8 IHZlY3QzMiB9IH0gfSB9IH0qLworLyogeyBkZy1maW5hbCB7IHNjYW4tdHJlZS1kdW1wLXRpbWVz ICJ2ZWN0b3JpemluZyBzdG10cyB1c2luZyBTTFAiIDQgInZlY3QiIHsgdGFyZ2V0IHsgdmVjdF9w YXJ0aWFsX3ZlY3RvcnMgfHwgdmVjdDMyIH0gfSB9IH0gKi8KICAgCmRpZmYgLS1naXQgYS9nY2Mv dGVzdHN1aXRlL2xpYi90YXJnZXQtc3VwcG9ydHMuZXhwIGIvZ2NjL3Rlc3RzdWl0ZS9saWIvdGFy Z2V0LXN1cHBvcnRzLmV4cAppbmRleCA4NDlmMWJiZWRhNS4uN2Y3OGM1NTkzYWMgMTAwNjQ0Ci0t LSBhL2djYy90ZXN0c3VpdGUvbGliL3RhcmdldC1zdXBwb3J0cy5leHAKKysrIGIvZ2NjL3Rlc3Rz dWl0ZS9saWIvdGFyZ2V0LXN1cHBvcnRzLmV4cApAQCAtNzYyNiw2ICs3NjI2LDcgQEAgcHJvYyBh dmFpbGFibGVfdmVjdG9yX3NpemVzIHsgfSB7CiAJaWYgeyAhW2lzLWVmZmVjdGl2ZS10YXJnZXQg 
aWEzMl0gfSB7CiAJICAgIGxhcHBlbmQgcmVzdWx0IDY0CiAJfQorCWxhcHBlbmQgcmVzdWx0IDMy CiAgICAgfSBlbHNlaWYgeyBbaXN0YXJnZXQgc3BhcmMqLSotKl0gfSB7CiAJbGFwcGVuZCByZXN1 bHQgNjQKICAgICB9IGVsc2VpZiB7IFtpc3RhcmdldCBhbWRnY24qLSotKl0gfSB7CkBAIC03NjU1 LDYgKzc2NTYsMTIgQEAgcHJvYyBjaGVja19lZmZlY3RpdmVfdGFyZ2V0X3ZlY3Q2NCB7IH0gewog ICAgIHJldHVybiBbZXhwciB7IFtsc2VhcmNoIC1leGFjdCBbYXZhaWxhYmxlX3ZlY3Rvcl9zaXpl c10gNjRdID49IDAgfV0KIH0KIAorIyBSZXR1cm4gMSBpZiB0aGUgdGFyZ2V0IHN1cHBvcnRzIHZl Y3RvcnMgb2YgMzIgYml0cy4KKworcHJvYyBjaGVja19lZmZlY3RpdmVfdGFyZ2V0X3ZlY3QzMiB7 IH0geworICAgIHJldHVybiBbZXhwciB7IFtsc2VhcmNoIC1leGFjdCBbYXZhaWxhYmxlX3ZlY3Rv cl9zaXplc10gMzJdID49IDAgfV0KK30KKwogIyBSZXR1cm4gMSBpZiB0aGUgdGFyZ2V0IHN1cHBv cnRzIHZlY3RvciBjb3B5c2lnbmYgY2FsbHMuCiAKIHByb2MgY2hlY2tfZWZmZWN0aXZlX3Rhcmdl dF92ZWN0X2NhbGxfY29weXNpZ25mIHsgfSB7Cg== --0000000000001bfa0905c337a5e2--