From mboxrd@z Thu Jan 1 00:00:00 1970
From: Prathamesh Kulkarni
Date: Tue, 27 Jun 2023 17:31:18 +0530
Subject: [SVE] Fold svdupq to VEC_PERM_EXPR if elements are not constant
To: Richard Sandiford, gcc Patches
MIME-Version: 1.0
Content-Type: multipart/mixed; boundary="000000000000aa8b7905ff1b3bfc"

--000000000000aa8b7905ff1b3bfc
Content-Type: text/plain; charset="UTF-8"

Hi Richard,
Sorry, I forgot to commit this patch, which you had approved in:
https://gcc.gnu.org/pipermail/gcc-patches/2023-April/615308.html

Just for context, for the following test:

svint32_t f_s32(int32x4_t x)
{
  return svdupq_s32 (x[0], x[1], x[2], x[3]);
}

-O3 -mcpu=generic+sve generates the following code after the
interleave+zip1 patch:

f_s32:
        dup     s31, v0.s[1]
        mov     v30.8b, v0.8b
        ins     v31.s[1], v0.s[3]
        ins     v30.s[1], v0.s[2]
        zip1    v0.4s, v30.4s, v31.4s
        dup     z0.q, z0.q[0]
        ret

Code-gen with the attached patch:

f_s32:
        dup     z0.q, z0.q[0]
        ret

Bootstrapped+tested on aarch64-linux-gnu.
OK to commit?

Thanks,
Prathamesh

--000000000000aa8b7905ff1b3bfc
Content-Type: text/plain; charset="US-ASCII"; name="gnu-829-3.txt"
Content-Disposition: attachment; filename="gnu-829-3.txt"
Content-Transfer-Encoding: 7bit
X-Attachment-Id: f_lje8khlp0

[SVE] Fold svdupq to VEC_PERM_EXPR if elements are not constant.

gcc/ChangeLog:
        * config/aarch64/aarch64-sve-builtins-base.cc
        (svdupq_impl::fold_nonconst_dupq): New method.
        (svdupq_impl::fold): Call fold_nonconst_dupq.

gcc/testsuite/ChangeLog:
        * gcc.target/aarch64/sve/acle/general/dupq_11.c: New test.

diff --git a/gcc/config/aarch64/aarch64-sve-builtins-base.cc b/gcc/config/aarch64/aarch64-sve-builtins-base.cc
index 95b4cb8a943..9010ecca6da 100644
--- a/gcc/config/aarch64/aarch64-sve-builtins-base.cc
+++ b/gcc/config/aarch64/aarch64-sve-builtins-base.cc
@@ -817,6 +817,52 @@ public:
 
 class svdupq_impl : public quiet<function_base>
 {
+private:
+  gimple *
+  fold_nonconst_dupq (gimple_folder &f) const
+  {
+    /* Lower lhs = svdupq (arg0, arg1, ..., argN} into:
+       tmp = {arg0, arg1, ..., arg<N-1>}
+       lhs = VEC_PERM_EXPR (tmp, tmp, {0, 1, 2, N-1, ...})  */
+
+    if (f.type_suffix (0).bool_p
+	|| BYTES_BIG_ENDIAN)
+      return NULL;
+
+    tree lhs = gimple_call_lhs (f.call);
+    tree lhs_type = TREE_TYPE (lhs);
+    tree elt_type = TREE_TYPE (lhs_type);
+    scalar_mode elt_mode = SCALAR_TYPE_MODE (elt_type);
+    machine_mode vq_mode = aarch64_vq_mode (elt_mode).require ();
+    tree vq_type = build_vector_type_for_mode (elt_type, vq_mode);
+
+    unsigned nargs = gimple_call_num_args (f.call);
+    vec<constructor_elt, va_gc> *v;
+    vec_alloc (v, nargs);
+    for (unsigned i = 0; i < nargs; i++)
+      CONSTRUCTOR_APPEND_ELT (v, NULL_TREE, gimple_call_arg (f.call, i));
+    tree vec = build_constructor (vq_type, v);
+    tree tmp = make_ssa_name_fn (cfun, vq_type, 0);
+    gimple *g = gimple_build_assign (tmp, vec);
+
+    gimple_seq stmts = NULL;
+    gimple_seq_add_stmt_without_update (&stmts, g);
+
+    poly_uint64 lhs_len = TYPE_VECTOR_SUBPARTS (lhs_type);
+    vec_perm_builder sel (lhs_len, nargs, 1);
+    for (unsigned i = 0; i < nargs; i++)
+      sel.quick_push (i);
+
+    vec_perm_indices indices (sel, 1, nargs);
+    tree mask_type = build_vector_type (ssizetype, lhs_len);
+    tree mask = vec_perm_indices_to_tree (mask_type, indices);
+
+    gimple *g2 = gimple_build_assign (lhs, VEC_PERM_EXPR, tmp, tmp, mask);
+    gimple_seq_add_stmt_without_update (&stmts, g2);
+    gsi_replace_with_seq (f.gsi, stmts, false);
+    return g2;
+  }
+
 public:
   gimple *
   fold (gimple_folder &f) const override
@@ -832,7 +878,7 @@ public:
       {
 	tree elt = gimple_call_arg (f.call, i);
 	if (!CONSTANT_CLASS_P (elt))
-	  return NULL;
+	  return fold_nonconst_dupq (f);
 	builder.quick_push (elt);
 	for (unsigned int j = 1; j < factor; ++j)
 	  builder.quick_push (build_zero_cst (TREE_TYPE (vec_type)));
diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/general/dupq_11.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/general/dupq_11.c
new file mode 100644
index 00000000000..f19f8deb1e5
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/general/dupq_11.c
@@ -0,0 +1,31 @@
+/* { dg-do compile } */
+/* { dg-options "-O3 -fdump-tree-optimized" } */
+
+#include <arm_sve.h>
+#include <arm_neon.h>
+
+svint8_t f_s8(int8x16_t x)
+{
+  return svdupq_s8 (x[0], x[1], x[2], x[3], x[4], x[5], x[6], x[7],
+		    x[8], x[9], x[10], x[11], x[12], x[13], x[14], x[15]);
+}
+
+svint16_t f_s16(int16x8_t x)
+{
+  return svdupq_s16 (x[0], x[1], x[2], x[3], x[4], x[5], x[6], x[7]);
+}
+
+svint32_t f_s32(int32x4_t x)
+{
+  return svdupq_s32 (x[0], x[1], x[2], x[3]);
+}
+
+svint64_t f_s64(int64x2_t x)
+{
+  return svdupq_s64 (x[0], x[1]);
+}
+
+/* { dg-final { scan-tree-dump "VEC_PERM_EXPR" "optimized" } } */
+/* { dg-final { scan-tree-dump-not "svdupq" "optimized" } } */
+
+/* { dg-final { scan-assembler-times {\tdup\tz[0-9]+\.q, z[0-9]+\.q\[0\]\n} 4 } } */

--000000000000aa8b7905ff1b3bfc--
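
For readers unfamiliar with the selector used above, here is a minimal scalar sketch of what the lowered form is meant to compute, assuming the selector built by vec_perm_builder sel (lhs_len, nargs, 1) expands to the series 0, 1, ..., nargs-1 repeated across the whole result. The function model_dupq is hypothetical and is not part of GCC or the patch, and the fixed out_len stands in for the runtime SVE vector length.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical model, not GCC code: `quad` plays the role of the 128-bit
// temporary {arg0, ..., arg<N-1>}, and `out_len` the number of lanes in the
// SVE result (assumed here to be a multiple of quad.size()).
std::vector<int> model_dupq(const std::vector<int> &quad, std::size_t out_len)
{
  assert(!quad.empty() && out_len % quad.size() == 0);
  std::vector<int> out(out_len);
  // Selector lane i is i % N, so the quadword is replicated across the result.
  for (std::size_t i = 0; i < out_len; ++i)
    out[i] = quad[i % quad.size()];
  return out;
}
```

For example, model_dupq ({10, 20, 30, 40}, 8) models a 256-bit vector of 32-bit lanes and repeats the four inputs twice, which matches the single "dup z0.q, z0.q[0]" in the expected code-gen.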