From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: from foss.arm.com (foss.arm.com [217.140.110.172]) by sourceware.org (Postfix) with ESMTP id 14F893858408 for ; Wed, 18 Oct 2023 14:41:10 +0000 (GMT)
DMARC-Filter: OpenDMARC Filter v1.4.2 sourceware.org 14F893858408
Authentication-Results: sourceware.org; dmarc=pass (p=none dis=none) header.from=arm.com
Authentication-Results: sourceware.org; spf=pass smtp.mailfrom=arm.com
ARC-Filter: OpenARC Filter v1.0.0 sourceware.org 14F893858408
Authentication-Results: server2.sourceware.org; arc=none smtp.remote-ip=217.140.110.172
ARC-Seal: i=1; a=rsa-sha256; d=sourceware.org; s=key; t=1697640073; cv=none; b=VPqpZGzL0gjhI2346jSD4tt6epNJv4R+pif3WgigU7p17KHjmtA6S2Qj8yakUB7kWUtKdOeRPGh3lt/JwM13dhMZ3cXimO4Cvh/MToHJWO7f6FLajxUCevqUJTPWnOL2pEumjbsdWkXgoCT9FqfY8beyKnHXITnonQCmaFx4T/U=
ARC-Message-Signature: i=1; a=rsa-sha256; d=sourceware.org; s=key; t=1697640073; c=relaxed/simple; bh=B6s7yM0GmDgZVTKT+oYzB/7XkLB3ntDSP45SoWgCDtk=; h=Message-ID:Date:MIME-Version:Subject:To:From; b=JdV3+xq2Cf/1zljStRqEP3IupdsDgZjTRpgrw5JclcdkcFIeF91NnUQs4gKAez9XqALePwEUazq7EYER0K5uymKX5jczAaDKdUPhH9qoDNc63Ybh8pNvwYam0MUnkizmC8XEpSaaT8Xvbpskm9I8uaCJzHtZ1e7KAK0b3Hm1yH0=
ARC-Authentication-Results: i=1; server2.sourceware.org
Received: from usa-sjc-imap-foss1.foss.arm.com (unknown [10.121.207.14]) by usa-sjc-mx-foss1.foss.arm.com (Postfix) with ESMTP id E287F2F4; Wed, 18 Oct 2023 07:41:50 -0700 (PDT)
Received: from [10.57.67.225] (unknown [10.57.67.225]) by usa-sjc-imap-foss1.foss.arm.com (Postfix) with ESMTPSA id 211753F64C; Wed, 18 Oct 2023 07:41:09 -0700 (PDT)
Content-Type: multipart/mixed; boundary="------------ijjz3xaHDvokjmdH2oZ0ARH0"
Message-ID:
Date: Wed, 18 Oct 2023 15:41:08 +0100
MIME-Version: 1.0
User-Agent: Mozilla Thunderbird
Subject: Re: [PATCH 5/8] vect: Use inbranch simdclones in masked loops
Content-Language: en-US
To: gcc-patches@gcc.gnu.org
Cc: Richard Biener , Richard Sandiford , "jakub@redhat.com"
References:
 <73b53052-c3a4-4028-2836-ade419431eda@arm.com>
From: "Andre Vieira (lists)"
In-Reply-To:
X-Spam-Status: No, score=-12.2 required=5.0 tests=BAYES_00,BODY_8BITS,GIT_PATCH_0,KAM_DMARC_NONE,KAM_DMARC_STATUS,KAM_LAZY_DOMAIN_SECURITY,SPF_HELO_NONE,SPF_NONE,TXREP autolearn=ham autolearn_force=no version=3.4.6
X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org
List-Id:

This is a multi-part message in MIME format.
--------------ijjz3xaHDvokjmdH2oZ0ARH0
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 8bit

Rebased, needs review.

On 30/08/2023 10:13, Andre Vieira (lists) via Gcc-patches wrote:
> This patch enables the compiler to use inbranch simdclones when
> generating masked loops in autovectorization.
>
> gcc/ChangeLog:
>
>     * omp-simd-clone.cc (simd_clone_adjust_argument_types): Make function
>     compatible with mask parameters in clone.
>     * tree-vect-stmts.cc (vect_convert): New helper function.
>     (vect_build_all_ones_mask): Allow vector boolean typed masks.
>     (vectorizable_simd_clone_call): Enable the use of masked clones in
>     fully masked loops.
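
For reference, a minimal sketch of the kind of source this change is aimed at. The names `scale` and `apply` are hypothetical and not taken from the patch or its testsuite. The assumption is that the callee is declared with an `inbranch` clause, so only masked vector variants exist, while the call itself is unconditional. When such a loop is vectorized as a fully masked loop, the loop mask can be passed as the clone's mask argument. Whether a given build actually vectorizes this depends on the target and on the options used (for example, OpenMP SIMD support must be enabled for the pragma to be honoured at all).

```c
/* Request only masked ("inbranch") SIMD variants of the callee.
   Without OpenMP SIMD support the pragma is ignored and the code
   still runs as ordinary scalar C.  */
#pragma omp declare simd inbranch
int
scale (int x)
{
  return x * 3 + 1;
}

/* Unconditional call in a simple counted loop.  If the loop is
   vectorized with full masking, an inbranch clone is a candidate
   callee, with the loop mask supplied as its mask argument.  */
void
apply (int *restrict out, const int *restrict in, int n)
{
  for (int i = 0; i < n; i++)
    out[i] = scale (in[i]);
}
```

The scalar semantics are unchanged by the pragma, so the same source can be used to compare scalar and vectorized results.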
--------------ijjz3xaHDvokjmdH2oZ0ARH0 Content-Type: text/plain; charset=UTF-8; name="sve_simd_clones_5v2.patch" Content-Disposition: attachment; filename="sve_simd_clones_5v2.patch" Content-Transfer-Encoding: base64 ZGlmZiAtLWdpdCBhL2djYy9vbXAtc2ltZC1jbG9uZS5jYyBiL2djYy9vbXAtc2ltZC1jbG9u ZS5jYwppbmRleCBhNDI2NDM0MDBkZGNmMTA5NjE2MzM0NDhiNDlkNGNhYWZiOTk5ZjEyLi5l ZjBiOWI0OGM3MjEyOTAwMDIzYmMwZWFlYmNhNWUxZjkzODlkYjc3IDEwMDY0NAotLS0gYS9n Y2Mvb21wLXNpbWQtY2xvbmUuY2MKKysrIGIvZ2NjL29tcC1zaW1kLWNsb25lLmNjCkBAIC04 MDcsOCArODA3LDE0IEBAIHNpbWRfY2xvbmVfYWRqdXN0X2FyZ3VtZW50X3R5cGVzIChzdHJ1 Y3QgY2dyYXBoX25vZGUgKm5vZGUpCiAgICAgewogICAgICAgaXBhX2FkanVzdGVkX3BhcmFt IGFkajsKICAgICAgIG1lbXNldCAoJmFkaiwgMCwgc2l6ZW9mIChhZGopKTsKLSAgICAgIHRy ZWUgcGFybSA9IGFyZ3NbaV07Ci0gICAgICB0cmVlIHBhcm1fdHlwZSA9IG5vZGUtPmRlZmlu aXRpb24gPyBUUkVFX1RZUEUgKHBhcm0pIDogcGFybTsKKyAgICAgIHRyZWUgcGFybSA9IE5V TExfVFJFRTsKKyAgICAgIHRyZWUgcGFybV90eXBlID0gTlVMTF9UUkVFOworICAgICAgaWYo aSA8IGFyZ3MubGVuZ3RoKCkpCisJeworCSAgcGFybSA9IGFyZ3NbaV07CisJICBwYXJtX3R5 cGUgPSBub2RlLT5kZWZpbml0aW9uID8gVFJFRV9UWVBFIChwYXJtKSA6IHBhcm07CisJfQor CiAgICAgICBhZGouYmFzZV9pbmRleCA9IGk7CiAgICAgICBhZGoucHJldl9jbG9uZV9pbmRl eCA9IGk7CiAKQEAgLTE1NDcsNyArMTU1Myw3IEBAIHNpbWRfY2xvbmVfYWRqdXN0IChzdHJ1 Y3QgY2dyYXBoX25vZGUgKm5vZGUpCiAJICBtYXNrID0gZ2ltcGxlX2Fzc2lnbl9saHMgKGcp OwogCSAgZyA9IGdpbXBsZV9idWlsZF9hc3NpZ24gKG1ha2Vfc3NhX25hbWUgKFRSRUVfVFlQ RSAobWFzaykpLAogCQkJCSAgIEJJVF9BTkRfRVhQUiwgbWFzaywKLQkJCQkgICBidWlsZF9p bnRfY3N0IChUUkVFX1RZUEUgKG1hc2spLCAxKSk7CisJCQkJICAgYnVpbGRfb25lX2NzdCAo VFJFRV9UWVBFIChtYXNrKSkpOwogCSAgZ3NpX2luc2VydF9hZnRlciAoJmdzaSwgZywgR1NJ X0NPTlRJTlVFX0xJTktJTkcpOwogCSAgbWFzayA9IGdpbXBsZV9hc3NpZ25fbGhzIChnKTsK IAl9CmRpZmYgLS1naXQgYS9nY2MvdHJlZS12ZWN0LXN0bXRzLmNjIGIvZ2NjL3RyZWUtdmVj dC1zdG10cy5jYwppbmRleCA3MzFhY2M3NjM1MGNhZTM5Yzg5OWE4NjY1ODQwNjhjZmYyNDcx ODNhLi42ZTJjNzBjMWQzOTcwYWY2NTJjMWU1MGU0MWIxNDQxNjI4ODRiZjI0IDEwMDY0NAot LS0gYS9nY2MvdHJlZS12ZWN0LXN0bXRzLmNjCisrKyBiL2djYy90cmVlLXZlY3Qtc3RtdHMu 
Y2MKQEAgLTE1OTQsNiArMTU5NCwyMCBAQCBjaGVja19sb2FkX3N0b3JlX2Zvcl9wYXJ0aWFs X3ZlY3RvcnMgKGxvb3BfdmVjX2luZm8gbG9vcF92aW5mbywgdHJlZSB2ZWN0eXBlLAogICAg IH0KIH0KIAorLyogUmV0dXJuIFNTQSBuYW1lIG9mIHRoZSByZXN1bHQgb2YgdGhlIGNvbnZl cnNpb24gb2YgT1BFUkFORCBpbnRvIHR5cGUgVFlQRS4KKyAgIFRoZSBjb252ZXJzaW9uIHN0 YXRlbWVudCBpcyBpbnNlcnRlZCBhdCBHU0kuICAqLworCitzdGF0aWMgdHJlZQordmVjdF9j b252ZXJ0ICh2ZWNfaW5mbyAqdmluZm8sIHN0bXRfdmVjX2luZm8gc3RtdF9pbmZvLCB0cmVl IHR5cGUsIHRyZWUgb3BlcmFuZCwKKwkgICAgICBnaW1wbGVfc3RtdF9pdGVyYXRvciAqZ3Np KQoreworICBvcGVyYW5kID0gYnVpbGQxIChWSUVXX0NPTlZFUlRfRVhQUiwgdHlwZSwgb3Bl cmFuZCk7CisgIGdhc3NpZ24gKm5ld19zdG10ID0gZ2ltcGxlX2J1aWxkX2Fzc2lnbiAobWFr ZV9zc2FfbmFtZSAodHlwZSksCisJCQkJCSAgIG9wZXJhbmQpOworICB2ZWN0X2ZpbmlzaF9z dG10X2dlbmVyYXRpb24gKHZpbmZvLCBzdG10X2luZm8sIG5ld19zdG10LCBnc2kpOworICBy ZXR1cm4gZ2ltcGxlX2dldF9saHMgKG5ld19zdG10KTsKK30KKwogLyogUmV0dXJuIHRoZSBt YXNrIGlucHV0IHRvIGEgbWFza2VkIGxvYWQgb3Igc3RvcmUuICBWRUNfTUFTSyBpcyB0aGUg dmVjdG9yaXplZAogICAgZm9ybSBvZiB0aGUgc2NhbGFyIG1hc2sgY29uZGl0aW9uIGFuZCBM T09QX01BU0ssIGlmIG5vbm51bGwsIGlzIHRoZSBtYXNrCiAgICB0aGF0IG5lZWRzIHRvIGJl IGFwcGxpZWQgdG8gYWxsIGxvYWRzIGFuZCBzdG9yZXMgaW4gYSB2ZWN0b3JpemVkIGxvb3Au CkBAIC0yNTQ3LDcgKzI1NjEsOCBAQCB2ZWN0X2J1aWxkX2FsbF9vbmVzX21hc2sgKHZlY19p bmZvICp2aW5mbywKIHsKICAgaWYgKFRSRUVfQ09ERSAobWFza3R5cGUpID09IElOVEVHRVJf VFlQRSkKICAgICByZXR1cm4gYnVpbGRfaW50X2NzdCAobWFza3R5cGUsIC0xKTsKLSAgZWxz ZSBpZiAoVFJFRV9DT0RFIChUUkVFX1RZUEUgKG1hc2t0eXBlKSkgPT0gSU5URUdFUl9UWVBF KQorICBlbHNlIGlmIChWRUNUT1JfQk9PTEVBTl9UWVBFX1AgKG1hc2t0eXBlKQorCSAgIHx8 IFRSRUVfQ09ERSAoVFJFRV9UWVBFIChtYXNrdHlwZSkpID09IElOVEVHRVJfVFlQRSkKICAg ICB7CiAgICAgICB0cmVlIG1hc2sgPSBidWlsZF9pbnRfY3N0IChUUkVFX1RZUEUgKG1hc2t0 eXBlKSwgLTEpOwogICAgICAgbWFzayA9IGJ1aWxkX3ZlY3Rvcl9mcm9tX3ZhbCAobWFza3R5 cGUsIG1hc2spOwpAQCAtNDE1Niw3ICs0MTcxLDcgQEAgdmVjdG9yaXphYmxlX3NpbWRfY2xv bmVfY2FsbCAodmVjX2luZm8gKnZpbmZvLCBzdG10X3ZlY19pbmZvIHN0bXRfaW5mbywKICAg c2l6ZV90IGksIG5hcmdzOwogICB0cmVlIGxocywgcnR5cGUsIHJhdHlwZTsKICAgdmVjPGNv 
bnN0cnVjdG9yX2VsdCwgdmFfZ2M+ICpyZXRfY3Rvcl9lbHRzID0gTlVMTDsKLSAgaW50IGFy Z19vZmZzZXQgPSAwOworICBpbnQgbWFza2VkX2NhbGxfb2Zmc2V0ID0gMDsKIAogICAvKiBJ cyBTVE1UIGEgdmVjdG9yaXphYmxlIGNhbGw/ICAgKi8KICAgZ2NhbGwgKnN0bXQgPSBkeW5f Y2FzdCA8Z2NhbGwgKj4gKHN0bXRfaW5mby0+c3RtdCk7CkBAIC00MTcxLDcgKzQxODYsNyBA QCB2ZWN0b3JpemFibGVfc2ltZF9jbG9uZV9jYWxsICh2ZWNfaW5mbyAqdmluZm8sIHN0bXRf dmVjX2luZm8gc3RtdF9pbmZvLAogICAgICAgZ2NjX2NoZWNraW5nX2Fzc2VydCAoVFJFRV9D T0RFIChmbmRlY2wpID09IEFERFJfRVhQUik7CiAgICAgICBmbmRlY2wgPSBUUkVFX09QRVJB TkQgKGZuZGVjbCwgMCk7CiAgICAgICBnY2NfY2hlY2tpbmdfYXNzZXJ0IChUUkVFX0NPREUg KGZuZGVjbCkgPT0gRlVOQ1RJT05fREVDTCk7Ci0gICAgICBhcmdfb2Zmc2V0ID0gMTsKKyAg ICAgIG1hc2tlZF9jYWxsX29mZnNldCA9IDE7CiAgICAgfQogICBpZiAoZm5kZWNsID09IE5V TExfVFJFRSkKICAgICByZXR1cm4gZmFsc2U7CkBAIC00MTk5LDcgKzQyMTQsNyBAQCB2ZWN0 b3JpemFibGVfc2ltZF9jbG9uZV9jYWxsICh2ZWNfaW5mbyAqdmluZm8sIHN0bXRfdmVjX2lu Zm8gc3RtdF9pbmZvLAogICAgIHJldHVybiBmYWxzZTsKIAogICAvKiBQcm9jZXNzIGZ1bmN0 aW9uIGFyZ3VtZW50cy4gICovCi0gIG5hcmdzID0gZ2ltcGxlX2NhbGxfbnVtX2FyZ3MgKHN0 bXQpIC0gYXJnX29mZnNldDsKKyAgbmFyZ3MgPSBnaW1wbGVfY2FsbF9udW1fYXJncyAoc3Rt dCkgLSBtYXNrZWRfY2FsbF9vZmZzZXQ7CiAKICAgLyogQmFpbCBvdXQgaWYgdGhlIGZ1bmN0 aW9uIGhhcyB6ZXJvIGFyZ3VtZW50cy4gICovCiAgIGlmIChuYXJncyA9PSAwKQpAQCAtNDIy MSw3ICs0MjM2LDcgQEAgdmVjdG9yaXphYmxlX3NpbWRfY2xvbmVfY2FsbCAodmVjX2luZm8g KnZpbmZvLCBzdG10X3ZlY19pbmZvIHN0bXRfaW5mbywKICAgICAgIHRoaXNhcmdpbmZvLm9w ID0gTlVMTF9UUkVFOwogICAgICAgdGhpc2FyZ2luZm8uc2ltZF9sYW5lX2xpbmVhciA9IGZh bHNlOwogCi0gICAgICBpbnQgb3Bfbm8gPSBpICsgYXJnX29mZnNldDsKKyAgICAgIGludCBv cF9ubyA9IGkgKyBtYXNrZWRfY2FsbF9vZmZzZXQ7CiAgICAgICBpZiAoc2xwX25vZGUpCiAJ b3Bfbm8gPSB2ZWN0X3NscF9jaGlsZF9pbmRleF9mb3Jfb3BlcmFuZCAoc3RtdCwgb3Bfbm8p OwogICAgICAgaWYgKCF2ZWN0X2lzX3NpbXBsZV91c2UgKHZpbmZvLCBzdG10X2luZm8sIHNs cF9ub2RlLApAQCAtNDMwMywxNiArNDMxOCw2IEBAIHZlY3Rvcml6YWJsZV9zaW1kX2Nsb25l X2NhbGwgKHZlY19pbmZvICp2aW5mbywgc3RtdF92ZWNfaW5mbyBzdG10X2luZm8sCiAgICAg ICBhcmdpbmZvLnF1aWNrX3B1c2ggKHRoaXNhcmdpbmZvKTsKICAgICB9CiAKLSAgaWYgKGxv 
b3BfdmluZm8KLSAgICAgICYmICFMT09QX1ZJTkZPX1ZFQ1RfRkFDVE9SIChsb29wX3ZpbmZv KS5pc19jb25zdGFudCAoKSkKLSAgICB7Ci0gICAgICBpZiAoZHVtcF9lbmFibGVkX3AgKCkp Ci0JZHVtcF9wcmludGZfbG9jIChNU0dfTUlTU0VEX09QVElNSVpBVElPTiwgdmVjdF9sb2Nh dGlvbiwKLQkJCSAibm90IGNvbnNpZGVyaW5nIFNJTUQgY2xvbmVzOyBub3QgeWV0IHN1cHBv cnRlZCIKLQkJCSAiIGZvciB2YXJpYWJsZS13aWR0aCB2ZWN0b3JzLlxuIik7Ci0gICAgICBy ZXR1cm4gZmFsc2U7Ci0gICAgfQotCiAgIHBvbHlfdWludDY0IHZmID0gbG9vcF92aW5mbyA/ IExPT1BfVklORk9fVkVDVF9GQUNUT1IgKGxvb3BfdmluZm8pIDogMTsKICAgdW5zaWduZWQg Z3JvdXBfc2l6ZSA9IHNscF9ub2RlID8gU0xQX1RSRUVfTEFORVMgKHNscF9ub2RlKSA6IDE7 CiAgIHVuc2lnbmVkIGludCBiYWRuZXNzID0gMDsKQEAgLTQzMjUsOSArNDMzMCwxMCBAQCB2 ZWN0b3JpemFibGVfc2ltZF9jbG9uZV9jYWxsICh2ZWNfaW5mbyAqdmluZm8sIHN0bXRfdmVj X2luZm8gc3RtdF9pbmZvLAogICAgICAgewogCXVuc2lnbmVkIGludCB0aGlzX2JhZG5lc3Mg PSAwOwogCXVuc2lnbmVkIGludCBudW1fY2FsbHM7Ci0JaWYgKCFjb25zdGFudF9tdWx0aXBs ZV9wICh2ZiAqIGdyb3VwX3NpemUsCi0JCQkJICBuLT5zaW1kY2xvbmUtPnNpbWRsZW4sICZu dW1fY2FsbHMpCi0JICAgIHx8IG4tPnNpbWRjbG9uZS0+bmFyZ3MgIT0gbmFyZ3MpCisJaWYg KCFjb25zdGFudF9tdWx0aXBsZV9wICh2ZiAqIGdyb3VwX3NpemUsIG4tPnNpbWRjbG9uZS0+ c2ltZGxlbiwKKwkJCQkgICZudW1fY2FsbHMpCisJICAgIHx8ICghbi0+c2ltZGNsb25lLT5p bmJyYW5jaCAmJiAobWFza2VkX2NhbGxfb2Zmc2V0ID4gMCkpCisJICAgIHx8IG5hcmdzICE9 IG4tPnNpbWRjbG9uZS0+bmFyZ3MpCiAJICBjb250aW51ZTsKIAlpZiAobnVtX2NhbGxzICE9 IDEpCiAJICB0aGlzX2JhZG5lc3MgKz0gZXhhY3RfbG9nMiAobnVtX2NhbGxzKSAqIDQwOTY7 CkBAIC00MzQ0LDcgKzQzNTAsOCBAQCB2ZWN0b3JpemFibGVfc2ltZF9jbG9uZV9jYWxsICh2 ZWNfaW5mbyAqdmluZm8sIHN0bXRfdmVjX2luZm8gc3RtdF9pbmZvLAogCSAgICAgIGNhc2Ug U0lNRF9DTE9ORV9BUkdfVFlQRV9WRUNUT1I6CiAJCWlmICghdXNlbGVzc190eXBlX2NvbnZl cnNpb25fcAogCQkJKG4tPnNpbWRjbG9uZS0+YXJnc1tpXS5vcmlnX3R5cGUsCi0JCQkgVFJF RV9UWVBFIChnaW1wbGVfY2FsbF9hcmcgKHN0bXQsIGkgKyBhcmdfb2Zmc2V0KSkpKQorCQkJ IFRSRUVfVFlQRSAoZ2ltcGxlX2NhbGxfYXJnIChzdG10LAorCQkJCQkJICAgICBpICsgbWFz a2VkX2NhbGxfb2Zmc2V0KSkpKQogCQkgIGkgPSAtMTsKIAkJZWxzZSBpZiAoYXJnaW5mb1tp XS5kdCA9PSB2ZWN0X2NvbnN0YW50X2RlZgogCQkJIHx8IGFyZ2luZm9baV0uZHQgPT0gdmVj 
dF9leHRlcm5hbF9kZWYKQEAgLTQzOTIsNiArNDM5OSwxNyBAQCB2ZWN0b3JpemFibGVfc2lt ZF9jbG9uZV9jYWxsICh2ZWNfaW5mbyAqdmluZm8sIHN0bXRfdmVjX2luZm8gc3RtdF9pbmZv LAogCSAgfQogCWlmIChpID09IChzaXplX3QpIC0xKQogCSAgY29udGludWU7CisJaWYgKG1h c2tlZF9jYWxsX29mZnNldCA9PSAwCisJICAgICYmIG4tPnNpbWRjbG9uZS0+aW5icmFuY2gK KwkgICAgJiYgbi0+c2ltZGNsb25lLT5uYXJncyA+IG5hcmdzKQorCSAgeworCSAgICBnY2Nf YXNzZXJ0IChuLT5zaW1kY2xvbmUtPmFyZ3Nbbi0+c2ltZGNsb25lLT5uYXJncyAtIDFdLmFy Z190eXBlID09CisJCQlTSU1EX0NMT05FX0FSR19UWVBFX01BU0spOworCSAgICAvKiBQZW5h bGl6ZSB1c2luZyBhIG1hc2tlZCBTSU1EIGNsb25lIGluIGEgbm9uLW1hc2tlZCBsb29wLCB0 aGF0IGlzCisJICAgICAgIG5vdCBpbiBhIGJyYW5jaCwgYXMgd2UnZCBoYXZlIHRvIGNvbnN0 cnVjdCBhbiBhbGwtdHJ1ZSBtYXNrLiAgKi8KKwkgICAgaWYgKCFsb29wX3ZpbmZvIHx8ICFM T09QX1ZJTkZPX0ZVTExZX01BU0tFRF9QIChsb29wX3ZpbmZvKSkKKwkgICAgICB0aGlzX2Jh ZG5lc3MgKz0gNjQ7CisJICB9CiAJaWYgKGJlc3RuID09IE5VTEwgfHwgdGhpc19iYWRuZXNz IDwgYmFkbmVzcykKIAkgIHsKIAkgICAgYmVzdG4gPSBuOwpAQCAtNDQxNCw3ICs0NDMyLDgg QEAgdmVjdG9yaXphYmxlX3NpbWRfY2xvbmVfY2FsbCAodmVjX2luZm8gKnZpbmZvLCBzdG10 X3ZlY19pbmZvIHN0bXRfaW5mbywKIAkgICB8fCBhcmdpbmZvW2ldLmR0ID09IHZlY3RfZXh0 ZXJuYWxfZGVmKQogCSAgJiYgYmVzdG4tPnNpbWRjbG9uZS0+YXJnc1tpXS5hcmdfdHlwZSA9 PSBTSU1EX0NMT05FX0FSR19UWVBFX1ZFQ1RPUikKIAl7Ci0JICB0cmVlIGFyZ190eXBlID0g VFJFRV9UWVBFIChnaW1wbGVfY2FsbF9hcmcgKHN0bXQsIGkgKyBhcmdfb2Zmc2V0KSk7CisJ ICB0cmVlIGFyZ190eXBlID0gVFJFRV9UWVBFIChnaW1wbGVfY2FsbF9hcmcgKHN0bXQsCisJ CQkJCQkgICAgICBpICsgbWFza2VkX2NhbGxfb2Zmc2V0KSk7CiAJICBhcmdpbmZvW2ldLnZl Y3R5cGUgPSBnZXRfdmVjdHlwZV9mb3Jfc2NhbGFyX3R5cGUgKHZpbmZvLCBhcmdfdHlwZSwK IAkJCQkJCQkgICAgc2xwX25vZGUpOwogCSAgaWYgKGFyZ2luZm9baV0udmVjdHlwZSA9PSBO VUxMCkBAIC00NTIzLDIyICs0NTQyLDM3IEBAIHZlY3Rvcml6YWJsZV9zaW1kX2Nsb25lX2Nh bGwgKHZlY19pbmZvICp2aW5mbywgc3RtdF92ZWNfaW5mbyBzdG10X2luZm8sCiAgICAgICBp ZiAoZ2ltcGxlX3Z1c2UgKHN0bXQpICYmIHNscF9ub2RlKQogCXZpbmZvLT5hbnlfa25vd25f bm90X3VwZGF0ZWRfdnNzYSA9IHRydWU7CiAgICAgICBzaW1kX2Nsb25lX2luZm8uc2FmZV9w dXNoIChiZXN0bi0+ZGVjbCk7Ci0gICAgICBmb3IgKGkgPSAwOyBpIDwgbmFyZ3M7IGkrKykK 
LQlpZiAoKGJlc3RuLT5zaW1kY2xvbmUtPmFyZ3NbaV0uYXJnX3R5cGUKLQkgICAgID09IFNJ TURfQ0xPTkVfQVJHX1RZUEVfTElORUFSX0NPTlNUQU5UX1NURVApCi0JICAgIHx8IChiZXN0 bi0+c2ltZGNsb25lLT5hcmdzW2ldLmFyZ190eXBlCi0JCT09IFNJTURfQ0xPTkVfQVJHX1RZ UEVfTElORUFSX1JFRl9DT05TVEFOVF9TVEVQKSkKLQkgIHsKLQkgICAgc2ltZF9jbG9uZV9p bmZvLnNhZmVfZ3Jvd19jbGVhcmVkIChpICogMyArIDEsIHRydWUpOwotCSAgICBzaW1kX2Ns b25lX2luZm8uc2FmZV9wdXNoIChhcmdpbmZvW2ldLm9wKTsKLQkgICAgdHJlZSBsc3QgPSBQ T0lOVEVSX1RZUEVfUCAoVFJFRV9UWVBFIChhcmdpbmZvW2ldLm9wKSkKLQkJICAgICAgID8g c2l6ZV90eXBlX25vZGUgOiBUUkVFX1RZUEUgKGFyZ2luZm9baV0ub3ApOwotCSAgICB0cmVl IGxzID0gYnVpbGRfaW50X2NzdCAobHN0LCBhcmdpbmZvW2ldLmxpbmVhcl9zdGVwKTsKLQkg ICAgc2ltZF9jbG9uZV9pbmZvLnNhZmVfcHVzaCAobHMpOwotCSAgICB0cmVlIHNsbCA9IGFy Z2luZm9baV0uc2ltZF9sYW5lX2xpbmVhcgotCQkgICAgICAgPyBib29sZWFuX3RydWVfbm9k ZSA6IGJvb2xlYW5fZmFsc2Vfbm9kZTsKLQkgICAgc2ltZF9jbG9uZV9pbmZvLnNhZmVfcHVz aCAoc2xsKTsKLQkgIH0KKyAgICAgIGZvciAoaSA9IDA7IGkgPCBiZXN0bi0+c2ltZGNsb25l LT5uYXJnczsgaSsrKQorCXsKKwkgIHN3aXRjaCAoYmVzdG4tPnNpbWRjbG9uZS0+YXJnc1tp XS5hcmdfdHlwZSkKKwkgICAgeworCSAgICBkZWZhdWx0OgorCSAgICAgIGNvbnRpbnVlOwor CSAgICBjYXNlIFNJTURfQ0xPTkVfQVJHX1RZUEVfTElORUFSX0NPTlNUQU5UX1NURVA6CisJ ICAgIGNhc2UgU0lNRF9DTE9ORV9BUkdfVFlQRV9MSU5FQVJfUkVGX0NPTlNUQU5UX1NURVA6 CisJICAgICAgeworCQlhdXRvICZjbG9uZV9pbmZvID0gU1RNVF9WSU5GT19TSU1EX0NMT05F X0lORk8gKHN0bXRfaW5mbyk7CisJCWNsb25lX2luZm8uc2FmZV9ncm93X2NsZWFyZWQgKGkg KiAzICsgMSwgdHJ1ZSk7CisJCWNsb25lX2luZm8uc2FmZV9wdXNoIChhcmdpbmZvW2ldLm9w KTsKKwkJdHJlZSBsc3QgPSBQT0lOVEVSX1RZUEVfUCAoVFJFRV9UWVBFIChhcmdpbmZvW2ld Lm9wKSkKKwkJCSAgID8gc2l6ZV90eXBlX25vZGUgOiBUUkVFX1RZUEUgKGFyZ2luZm9baV0u b3ApOworCQl0cmVlIGxzID0gYnVpbGRfaW50X2NzdCAobHN0LCBhcmdpbmZvW2ldLmxpbmVh cl9zdGVwKTsKKwkJY2xvbmVfaW5mby5zYWZlX3B1c2ggKGxzKTsKKwkJdHJlZSBzbGwgPSBh cmdpbmZvW2ldLnNpbWRfbGFuZV9saW5lYXIKKwkJCSAgID8gYm9vbGVhbl90cnVlX25vZGUg OiBib29sZWFuX2ZhbHNlX25vZGU7CisJCWNsb25lX2luZm8uc2FmZV9wdXNoIChzbGwpOwor CSAgICAgIH0KKwkgICAgICBicmVhazsKKwkgICAgY2FzZSBTSU1EX0NMT05FX0FSR19UWVBF 
X01BU0s6CisJICAgICAgaWYgKGxvb3BfdmluZm8KKwkJICAmJiBMT09QX1ZJTkZPX0NBTl9V U0VfUEFSVElBTF9WRUNUT1JTX1AgKGxvb3BfdmluZm8pKQorCQl2ZWN0X3JlY29yZF9sb29w X21hc2sgKGxvb3BfdmluZm8sCisJCQkJICAgICAgICZMT09QX1ZJTkZPX01BU0tTIChsb29w X3ZpbmZvKSwKKwkJCQkgICAgICAgbmNvcGllcywgdmVjdHlwZSwgb3ApOworCisJICAgICAg YnJlYWs7CisJICAgIH0KKwl9CiAKICAgICAgIGlmICghYmVzdG4tPnNpbWRjbG9uZS0+aW5i cmFuY2ggJiYgbG9vcF92aW5mbykKIAl7CkBAIC00NTkwLDYgKzQ2MjQsOCBAQCB2ZWN0b3Jp emFibGVfc2ltZF9jbG9uZV9jYWxsICh2ZWNfaW5mbyAqdmluZm8sIHN0bXRfdmVjX2luZm8g c3RtdF9pbmZvLAogICAgIHZlY19vcHJuZHMuc2FmZV9ncm93X2NsZWFyZWQgKG5hcmdzLCB0 cnVlKTsKICAgZm9yIChqID0gMDsgaiA8IG5jb3BpZXM7ICsraikKICAgICB7CisgICAgICBw b2x5X3VpbnQ2NCBjYWxsZWVfbmVsZW1lbnRzOworICAgICAgcG9seV91aW50NjQgY2FsbGVy X25lbGVtZW50czsKICAgICAgIC8qIEJ1aWxkIGFyZ3VtZW50IGxpc3QgZm9yIHRoZSB2ZWN0 b3JpemVkIGNhbGwuICAqLwogICAgICAgaWYgKGogPT0gMCkKIAl2YXJncy5jcmVhdGUgKG5h cmdzKTsKQEAgLTQ2MDAsOCArNDYzNiw3IEBAIHZlY3Rvcml6YWJsZV9zaW1kX2Nsb25lX2Nh bGwgKHZlY19pbmZvICp2aW5mbywgc3RtdF92ZWNfaW5mbyBzdG10X2luZm8sCiAJewogCSAg dW5zaWduZWQgaW50IGssIGwsIG0sIG87CiAJICB0cmVlIGF0eXBlOwotCSAgcG9seV91aW50 NjQgY2FsbGVlX25lbGVtZW50cywgY2FsbGVyX25lbGVtZW50czsKLQkgIG9wID0gZ2ltcGxl X2NhbGxfYXJnIChzdG10LCBpICsgYXJnX29mZnNldCk7CisJICBvcCA9IGdpbXBsZV9jYWxs X2FyZyAoc3RtdCwgaSArIG1hc2tlZF9jYWxsX29mZnNldCk7CiAJICBzd2l0Y2ggKGJlc3Ru LT5zaW1kY2xvbmUtPmFyZ3NbaV0uYXJnX3R5cGUpCiAJICAgIHsKIAkgICAgY2FzZSBTSU1E X0NMT05FX0FSR19UWVBFX1ZFQ1RPUjoKQEAgLTQ2ODAsMTYgKzQ3MTUsOSBAQCB2ZWN0b3Jp emFibGVfc2ltZF9jbG9uZV9jYWxsICh2ZWNfaW5mbyAqdmluZm8sIHN0bXRfdmVjX2luZm8g c3RtdF9pbmZvLAogCQkgICAgICBpZiAoayA9PSAxKQogCQkJaWYgKCF1c2VsZXNzX3R5cGVf Y29udmVyc2lvbl9wIChUUkVFX1RZUEUgKHZlY19vcHJuZDApLAogCQkJCQkJICAgICAgIGF0 eXBlKSkKLQkJCSAgewotCQkJICAgIHZlY19vcHJuZDAKLQkJCSAgICAgID0gYnVpbGQxIChW SUVXX0NPTlZFUlRfRVhQUiwgYXR5cGUsIHZlY19vcHJuZDApOwotCQkJICAgIGdhc3NpZ24g Km5ld19zdG10Ci0JCQkgICAgICA9IGdpbXBsZV9idWlsZF9hc3NpZ24gKG1ha2Vfc3NhX25h bWUgKGF0eXBlKSwKLQkJCQkJCSAgICAgdmVjX29wcm5kMCk7Ci0JCQkgICAgdmVjdF9maW5p 
c2hfc3RtdF9nZW5lcmF0aW9uICh2aW5mbywgc3RtdF9pbmZvLAotCQkJCQkJCSBuZXdfc3Rt dCwgZ3NpKTsKLQkJCSAgICB2YXJncy5zYWZlX3B1c2ggKGdpbXBsZV9hc3NpZ25fbGhzIChu ZXdfc3RtdCkpOwotCQkJICB9CisJCQkgIHZhcmdzLnNhZmVfcHVzaCAodmVjdF9jb252ZXJ0 ICh2aW5mbywgc3RtdF9pbmZvLAorCQkJCQkJCSBhdHlwZSwgdmVjX29wcm5kMCwKKwkJCQkJ CQkgZ3NpKSk7CiAJCQllbHNlCiAJCQkgIHZhcmdzLnNhZmVfcHVzaCAodmVjX29wcm5kMCk7 CiAJCSAgICAgIGVsc2UKQEAgLTQ3MzgsNiArNDc2NiwyNCBAQCB2ZWN0b3JpemFibGVfc2lt ZF9jbG9uZV9jYWxsICh2ZWNfaW5mbyAqdmluZm8sIHN0bXRfdmVjX2luZm8gc3RtdF9pbmZv LAogCQkJICAgICAgdmVjX29wcm5kc19pW2ldID0gMDsKIAkJCSAgICB9CiAJCQkgIHZlY19v cHJuZDAgPSB2ZWNfb3BybmRzW2ldW3ZlY19vcHJuZHNfaVtpXSsrXTsKKwkJCSAgaWYgKGxv b3BfdmluZm8KKwkJCSAgICAgICYmIExPT1BfVklORk9fRlVMTFlfTUFTS0VEX1AgKGxvb3Bf dmluZm8pKQorCQkJICAgIHsKKwkJCSAgICAgIHZlY19sb29wX21hc2tzICpsb29wX21hc2tz CisJCQkJPSAmTE9PUF9WSU5GT19NQVNLUyAobG9vcF92aW5mbyk7CisJCQkgICAgICB0cmVl IGxvb3BfbWFzaworCQkJCT0gdmVjdF9nZXRfbG9vcF9tYXNrIChsb29wX3ZpbmZvLCBnc2ks CisJCQkJCQkgICAgICBsb29wX21hc2tzLCBuY29waWVzLAorCQkJCQkJICAgICAgdmVjdHlw ZSwgaik7CisJCQkgICAgICB2ZWNfb3BybmQwCisJCQkJPSBwcmVwYXJlX3ZlY19tYXNrIChs b29wX3ZpbmZvLAorCQkJCQkJICAgIFRSRUVfVFlQRSAobG9vcF9tYXNrKSwKKwkJCQkJCSAg ICBsb29wX21hc2ssIHZlY19vcHJuZDAsCisJCQkJCQkgICAgZ3NpKTsKKwkJCSAgICAgIGxv b3BfdmluZm8tPnZlY19jb25kX21hc2tlZF9zZXQuYWRkICh7IHZlY19vcHJuZDAsCisJCQkJ CQkJCSAgICAgbG9vcF9tYXNrIH0pOworCisJCQkgICAgfQogCQkJICB2ZWNfb3BybmQwCiAJ CQkgICAgPSBidWlsZDMgKFZFQ19DT05EX0VYUFIsIGF0eXBlLCB2ZWNfb3BybmQwLAogCQkJ CSAgICAgIGJ1aWxkX3ZlY3Rvcl9mcm9tX3ZhbCAoYXR5cGUsIG9uZSksCkBAIC00OTAxLDYg KzQ5NDcsNjQgQEAgdmVjdG9yaXphYmxlX3NpbWRfY2xvbmVfY2FsbCAodmVjX2luZm8gKnZp bmZvLCBzdG10X3ZlY19pbmZvIHN0bXRfaW5mbywKIAkgICAgfQogCX0KIAorICAgICAgaWYg KG1hc2tlZF9jYWxsX29mZnNldCA9PSAwCisJICAmJiBiZXN0bi0+c2ltZGNsb25lLT5pbmJy YW5jaAorCSAgJiYgYmVzdG4tPnNpbWRjbG9uZS0+bmFyZ3MgPiBuYXJncykKKwl7CisJICB1 bnNpZ25lZCBsb25nIG0sIG87CisJICBzaXplX3QgbWFza19pID0gYmVzdG4tPnNpbWRjbG9u ZS0+bmFyZ3MgLSAxOworCSAgdHJlZSBtYXNrOworCSAgZ2NjX2Fzc2VydCAoYmVzdG4tPnNp 
bWRjbG9uZS0+YXJnc1ttYXNrX2ldLmFyZ190eXBlID09CisJCSAgICAgIFNJTURfQ0xPTkVf QVJHX1RZUEVfTUFTSyk7CisKKwkgIHRyZWUgbWFza3R5cGUgPSBiZXN0bi0+c2ltZGNsb25l LT5hcmdzW21hc2tfaV0udmVjdG9yX3R5cGU7CisJICBjYWxsZWVfbmVsZW1lbnRzID0gVFlQ RV9WRUNUT1JfU1VCUEFSVFMgKG1hc2t0eXBlKTsKKwkgIG8gPSB2ZWN0b3JfdW5yb2xsX2Zh Y3RvciAobnVuaXRzLCBjYWxsZWVfbmVsZW1lbnRzKTsKKwkgIGZvciAobSA9IGogKiBvOyBt IDwgKGogKyAxKSAqIG87IG0rKykKKwkgICAgeworCSAgICAgIGlmIChsb29wX3ZpbmZvICYm IExPT1BfVklORk9fRlVMTFlfTUFTS0VEX1AgKGxvb3BfdmluZm8pKQorCQl7CisJCSAgdmVj X2xvb3BfbWFza3MgKmxvb3BfbWFza3MgPSAmTE9PUF9WSU5GT19NQVNLUyAobG9vcF92aW5m byk7CisJCSAgbWFzayA9IHZlY3RfZ2V0X2xvb3BfbWFzayAobG9vcF92aW5mbywgZ3NpLCBs b29wX21hc2tzLAorCQkJCQkgICAgIG5jb3BpZXMsIHZlY3R5cGUsIGopOworCQl9CisJICAg ICAgZWxzZQorCQltYXNrID0gdmVjdF9idWlsZF9hbGxfb25lc19tYXNrICh2aW5mbywgc3Rt dF9pbmZvLCBtYXNrdHlwZSk7CisKKwkgICAgICBpZiAoIXVzZWxlc3NfdHlwZV9jb252ZXJz aW9uX3AgKFRSRUVfVFlQRSAobWFzayksIG1hc2t0eXBlKSkKKwkJeworCQkgIGdhc3NpZ24g Km5ld19zdG10OworCQkgIGlmIChiZXN0bi0+c2ltZGNsb25lLT5tYXNrX21vZGUgIT0gVk9J RG1vZGUpCisJCSAgICB7CisJCSAgICAgIC8qIFRoaXMgbWVhbnMgd2UgYXJlIGRlYWxpbmcg d2l0aCBpbnRlZ2VyIG1hc2sgbW9kZXMuCisJCQkgRmlyc3QgY29udmVydCB0byBhbiBpbnRl Z2VyIHR5cGUgd2l0aCB0aGUgc2FtZSBzaXplIGFzCisJCQkgdGhlIGN1cnJlbnQgdmVjdG9y IHR5cGUuICAqLworCQkgICAgICB1bnNpZ25lZCBIT1NUX1dJREVfSU5UIGludGVybWVkaWF0 ZV9zaXplCisJCQk9IHRyZWVfdG9fdWh3aSAoVFlQRV9TSVpFIChUUkVFX1RZUEUgKG1hc2sp KSk7CisJCSAgICAgIHRyZWUgbWlkX2ludF90eXBlID0KKwkJCWJ1aWxkX25vbnN0YW5kYXJk X2ludGVnZXJfdHlwZSAoaW50ZXJtZWRpYXRlX3NpemUsIDEpOworCQkgICAgICBtYXNrID0g YnVpbGQxIChWSUVXX0NPTlZFUlRfRVhQUiwgbWlkX2ludF90eXBlLCBtYXNrKTsKKwkJICAg ICAgbmV3X3N0bXQKKwkJCT0gZ2ltcGxlX2J1aWxkX2Fzc2lnbiAobWFrZV9zc2FfbmFtZSAo bWlkX2ludF90eXBlKSwKKwkJCQkJICAgICAgIG1hc2spOworCQkgICAgICBnc2lfaW5zZXJ0 X2JlZm9yZSAoZ3NpLCBuZXdfc3RtdCwgR1NJX1NBTUVfU1RNVCk7CisJCSAgICAgIC8qIFRo ZW4gemVyby1leHRlbmQgdG8gdGhlIG1hc2sgbW9kZS4gICovCisJCSAgICAgIG1hc2sgPSBm b2xkX2J1aWxkMSAoTk9QX0VYUFIsIG1hc2t0eXBlLAorCQkJCQkgIGdpbXBsZV9nZXRfbGhz 
IChuZXdfc3RtdCkpOworCQkgICAgfQorCQkgIGVsc2UKKwkJICAgIG1hc2sgPSBidWlsZDEg KFZJRVdfQ09OVkVSVF9FWFBSLCBtYXNrdHlwZSwgbWFzayk7CisKKwkJICBuZXdfc3RtdCA9 IGdpbXBsZV9idWlsZF9hc3NpZ24gKG1ha2Vfc3NhX25hbWUgKG1hc2t0eXBlKSwKKwkJCQkJ CSAgbWFzayk7CisJCSAgdmVjdF9maW5pc2hfc3RtdF9nZW5lcmF0aW9uICh2aW5mbywgc3Rt dF9pbmZvLAorCQkJCQkgICAgICAgbmV3X3N0bXQsIGdzaSk7CisJCSAgbWFzayA9IGdpbXBs ZV9hc3NpZ25fbGhzIChuZXdfc3RtdCk7CisJCX0KKwkgICAgICB2YXJncy5zYWZlX3B1c2gg KG1hc2spOworCSAgICB9CisJfQorCiAgICAgICBnY2FsbCAqbmV3X2NhbGwgPSBnaW1wbGVf YnVpbGRfY2FsbF92ZWMgKGZuZGVjbCwgdmFyZ3MpOwogICAgICAgaWYgKHZlY19kZXN0KQog CXsK --------------ijjz3xaHDvokjmdH2oZ0ARH0--