From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (qmail 79565 invoked by alias); 31 Aug 2019 14:13:30 -0000 Mailing-List: contact gcc-patches-help@gcc.gnu.org; run by ezmlm Precedence: bulk List-Id: List-Archive: List-Post: List-Help: Sender: gcc-patches-owner@gcc.gnu.org Received: (qmail 79551 invoked by uid 89); 31 Aug 2019 14:13:29 -0000 Authentication-Results: sourceware.org; auth=none X-Spam-SWARE-Status: No, score=-21.1 required=5.0 tests=AWL,BAYES_00,GIT_PATCH_0,GIT_PATCH_1,GIT_PATCH_2,GIT_PATCH_3,KAM_NUMSUBJECT,RCVD_IN_DNSWL_NONE,SPF_PASS autolearn=ham version=3.3.1 spammy=H*i:sk:CAFiYyc, relying, ands X-HELO: mail-lj1-f181.google.com Received: from mail-lj1-f181.google.com (HELO mail-lj1-f181.google.com) (209.85.208.181) by sourceware.org (qpsmtpd/0.93/v0.84-503-g423c35a) with ESMTP; Sat, 31 Aug 2019 14:13:27 +0000 Received: by mail-lj1-f181.google.com with SMTP id m24so8991877ljg.8 for ; Sat, 31 Aug 2019 07:13:26 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; h=mime-version:references:in-reply-to:from:date:message-id:subject:to :cc; bh=69FpZgHJwyuyk3d8wU5F9bGKX0dJQQMDvgCwMapXE6A=; b=OzsjsEyIdTwOMgMf8jEWpl7B4NUI2oHwxVITd6gH9Tb21ZGcbh4bATLVe5c2bYaUCP p3rxGPV+Mm/bZZADVwpM5M7S9w4HGcQPWUDo3zsChajdNScrwTxp2xGIpv1d1a3OAswQ TCL55NFQOIMxe3xfpIkjqENZYi1TyQj0tDP7sl5wLDe0gQbUEiyTA9QuoutvsRFKf5dG yZKk2AQgAYWYCDfNAT91JVbh1OsHd80P2TlcKxRXGJo0ZzJuTUxZIdPEWy64bm5GZ9Rx a1h/gpoph/eL1ChBYQIbs6mi+8vKux6EcCTz0S8cuWfBmYpG6wPG8x7p8It36mEP/+jp wzgg== MIME-Version: 1.0 References: In-Reply-To: From: Prathamesh Kulkarni Date: Sat, 31 Aug 2019 16:56:00 -0000 Message-ID: Subject: Re: [SVE] PR86753 To: Richard Biener Cc: gcc Patches , Richard Sandiford Content-Type: multipart/mixed; boundary="00000000000088278d05916a5714" X-IsSubscribed: yes X-SW-Source: 2019-08/txt/msg02134.txt.bz2 --00000000000088278d05916a5714 Content-Type: text/plain; charset="UTF-8" Content-length: 7027 On Fri, 30 Aug 2019 at 16:15, Richard Biener wrote: > > 
On Wed, Aug 28, 2019 at 11:02 AM Richard Sandiford > wrote: > > > > Prathamesh Kulkarni writes: > > > On Tue, 27 Aug 2019 at 21:14, Richard Sandiford > > > wrote: > > >> > > >> Richard should have the final say, but some comments... > > >> > > >> Prathamesh Kulkarni writes: > > >> > diff --git a/gcc/tree-vect-stmts.c b/gcc/tree-vect-stmts.c > > >> > index 1e2dfe5d22d..862206b3256 100644 > > >> > --- a/gcc/tree-vect-stmts.c > > >> > +++ b/gcc/tree-vect-stmts.c > > >> > @@ -1989,17 +1989,31 @@ check_load_store_masking (loop_vec_info loop_vinfo, tree vectype, > > >> > > > >> > static tree > > >> > prepare_load_store_mask (tree mask_type, tree loop_mask, tree vec_mask, > > >> > - gimple_stmt_iterator *gsi) > > >> > + gimple_stmt_iterator *gsi, tree mask, > > >> > + cond_vmask_map_type *cond_to_vec_mask) > > >> > > >> "scalar_mask" might be a better name. But maybe we should key off the > > >> vector mask after all, now that we're relying on the code having no > > >> redundancies. > > >> > > >> Passing the vinfo would be better than passing the cond_vmask_map_type > > >> directly. > > >> > > >> > { > > >> > gcc_assert (useless_type_conversion_p (mask_type, TREE_TYPE (vec_mask))); > > >> > if (!loop_mask) > > >> > return vec_mask; > > >> > > > >> > gcc_assert (TREE_TYPE (loop_mask) == mask_type); > > >> > + > > >> > + tree *slot = 0; > > >> > + if (cond_to_vec_mask) > > >> > > >> The pointer should never be null in this context. > > > Disabling check for NULL results in segfault with cond_arith_4.c because we > > > reach prepare_load_store_mask via vect_schedule_slp, called from > > > here in vect_transform_loop: > > > /* Schedule the SLP instances first, then handle loop vectorization > > > below. */ > > > if (!loop_vinfo->slp_instances.is_empty ()) > > > { > > > DUMP_VECT_SCOPE ("scheduling SLP instances"); > > > vect_schedule_slp (loop_vinfo); > > > } > > > > > > which is before bb processing loop. > > > > We want this optimisation to be applied to SLP too though. 
Especially > > since non-SLP will be going away at some point.
> >
> > But as Richard says, the problem with SLP is that the statements aren't
> > traversed in block order, so I guess we can't do the on-the-fly
> > redundancy elimination there...
>
> And the current patch AFAICS can generate wrong SSA for this reason.
>
> > Maybe an alternative would be to record during the analysis phase which
> > scalar conditions need which loop masks. Statements that need a loop
> > mask currently do:
> >
> >   vect_record_loop_mask (loop_vinfo, masks, ncopies, vectype);
> >
> > If we also pass the scalar condition, we can maintain a hash_set of
> > pairs, representing the conditions that have
> > loop masks applied at some point in the vectorised code. The COND_EXPR
> > code can use that set to decide whether to apply the loop mask or not.
>
> Yeah, that sounds better.
>
> Note that I don't like the extra "helpers" in fold-const.c/h, they do not look
> useful in general so put them into vectorizer private code. The decomposing
> also doesn't look too nice, instead prepare_load_store_mask could get
> such decomposed representation - possibly quite natural with the suggestion
> from Richard above.

Hi,
Thanks for the suggestions. I have attached an updated patch that tries to address them. With the patch, we manage to use the same predicate for both tests in the PR, and the redundant AND ops are eliminated by fre4.

I have a few doubts:

1] I moved tree_cond_ops into tree-vectorizer.[ch]; I will get rid of it in a follow-up patch. I am not sure what to pass as the def of the scalar condition (scalar_mask) to vect_record_loop_mask from vectorizable_store, vectorizable_reduction and vectorizable_live_operation. In the patch, I just passed NULL.

2] Do the changes to vectorizable_condition and vectorizable_condition_apply_loop_mask look OK?

3] The patch additionally regresses the following tests (apart from fmla_2.c):
FAIL: gcc.target/aarch64/sve/cond_convert_1.c -march=armv8.2-a+sve scan-assembler-not \\tsel\\t
FAIL: gcc.target/aarch64/sve/cond_convert_4.c -march=armv8.2-a+sve scan-assembler-not \\tsel\\t
FAIL: gcc.target/aarch64/sve/cond_unary_2.c -march=armv8.2-a+sve scan-assembler-not \\tsel\\t
FAIL: gcc.target/aarch64/sve/cond_unary_2.c -march=armv8.2-a+sve scan-assembler-times \\tmovprfx\\t

The issue with cond_convert_1.c can be reproduced with the following test-case:

#include <stdint.h>

void __attribute__((noipa))
test_int16_t(_Float16 *__restrict r, int16_t *__restrict a,
             _Float16 *__restrict b, int16_t *__restrict pred, int n)
{
  for (int i = 0; i < n; ++i)
    r[i] = pred[i] ? (_Float16)a[i] : b[i];
}

Before the patch, the vect dump shows:

  mask__41.15_56 = vect__4.9_47 == vect_cst__55;
  _41 = _4 == 0;
  vec_mask_and_59 = mask__41.15_56 & loop_mask_46;
  vect_iftmp.18_60 = .MASK_LOAD (vectp_b.16_57, 2B, vec_mask_and_59);
  iftmp.0_16 = 0.0;
  vect_iftmp.19_62 = VEC_COND_EXPR ;
  iftmp.0_10 = _4 == 0 ? iftmp.0_16 : iftmp.0_18;

fre4 then seems to interchange the operands of the vec_cond_expr and invert the comparison code:

  mask__41.15_56 = vect__4.9_47 == { 0, ... };
  vec_mask_and_59 = mask__41.15_56 & loop_mask_46;
  _1 = &MEM[base: b_15(D), index: ivtmp_66, step: 2, offset: 0B];
  vect_iftmp.18_60 = .MASK_LOAD (_1, 2B, vec_mask_and_59);
  vect_iftmp.19_62 = VEC_COND_EXPR ;

After the patch, the vect dump shows:

  mask__41.15_56 = vect__4.9_47 == vect_cst__55;
  _41 = _4 == 0;
  vec_mask_and_59 = mask__41.15_56 & loop_mask_46;
  vect_iftmp.18_60 = .MASK_LOAD (vectp_b.16_57, 2B, vec_mask_and_59);
  iftmp.0_16 = 0.0;
  _62 = vect__4.9_47 == vect_cst__61;
  _63 = _62 & loop_mask_46;
  vect_iftmp.19_64 = VEC_COND_EXPR <_63, vect_iftmp.18_60, vect_iftmp.14_54>;
  iftmp.0_10 = _4 == 0 ? iftmp.0_16 : iftmp.0_18;

which is then cleaned up by fre4:

  mask__41.15_56 = vect__4.9_47 == { 0, ... 
}; vec_mask_and_59 = mask__41.15_56 & loop_mask_46; _1 = &MEM[base: b_15(D), index: ivtmp_68, step: 2, offset: 0B]; vect_iftmp.18_60 = .MASK_LOAD (_1, 2B, vec_mask_and_59); vect_iftmp.19_64 = VEC_COND_EXPR ; In this case, fre4 does not interchange the operands, and reuses vec_mask_and_59 in vec_cond_expr, which perhaps results in different code-gen ? I didn't investigate other tests so far, because they look quite similar to cond_convert_1.c, and possibly have the same issue. Thanks, Prathamesh > > Richard. > > > Trying to avoid duplicate ANDs with the loop mask would then become a > > separate follow-on change. Not sure whether it's worth it on its own. > > > > Thanks, > > Richard --00000000000088278d05916a5714 Content-Type: application/x-patch; name="pr86753-v2-1.diff" Content-Disposition: attachment; filename="pr86753-v2-1.diff" Content-Transfer-Encoding: base64 Content-ID: X-Attachment-Id: f_jzzlr65d0 Content-length: 17040 ZGlmZiAtLWdpdCBhL2djYy90ZXN0c3VpdGUvZ2NjLnRhcmdldC9hYXJjaDY0 L3N2ZS9mbWxhXzIuYyBiL2djYy90ZXN0c3VpdGUvZ2NjLnRhcmdldC9hYXJj aDY0L3N2ZS9mbWxhXzIuYwppbmRleCA1YzA0YmNkYjNmNS4uYTFiMDY2N2Rh YjUgMTAwNjQ0Ci0tLSBhL2djYy90ZXN0c3VpdGUvZ2NjLnRhcmdldC9hYXJj aDY0L3N2ZS9mbWxhXzIuYworKysgYi9nY2MvdGVzdHN1aXRlL2djYy50YXJn ZXQvYWFyY2g2NC9zdmUvZm1sYV8yLmMKQEAgLTE1LDUgKzE1LDkgQEAgZiAo ZG91YmxlICpyZXN0cmljdCBhLCBkb3VibGUgKnJlc3RyaWN0IGIsIGRvdWJs ZSAqcmVzdHJpY3QgYywKICAgICB9CiB9CiAKLS8qIHsgZGctZmluYWwgeyBz Y2FuLWFzc2VtYmxlci10aW1lcyB7XHRmbWxhXHR6WzAtOV0rXC5kLCBwWzAt N10vbSwgelswLTldK1wuZCwgelswLTldK1wuZFxufSAyIH0gfSAqLworLyog U2VlIGh0dHBzOi8vZ2NjLmdudS5vcmcvbWwvZ2NjLXBhdGNoZXMvMjAxOS0w OC9tc2cwMTY0NC5odG1sCisgICBmb3IgWEZBSUxpbmcgdGhlIGJlbG93IHRl c3QuICAqLworCisvKiB7IGRnLWZpbmFsIHsgc2Nhbi1hc3NlbWJsZXItdGlt ZXMge1x0Zm1sYVx0elswLTldK1wuZCwgcFswLTddL20sIHpbMC05XStcLmQs IHpbMC05XStcLmRcbn0gMiB7IHhmYWlsICotKi0qIH0gfSB9ICovCisvKiB7 IGRnLWZpbmFsIHsgc2Nhbi1hc3NlbWJsZXItdGltZXMge1x0Zm1sYVx0elsw LTldK1wuZCwgcFswLTddL20sIHpbMC05XStcLmQsIHpbMC05XStcLmRcbn0g 
MyB9IH0gKi8KIC8qIHsgZGctZmluYWwgeyBzY2FuLWFzc2VtYmxlci1ub3Qg e1x0Zm1hZFx0fSB9IH0gKi8KZGlmZiAtLWdpdCBhL2djYy90cmVlLXZlY3Qt bG9vcC5jIGIvZ2NjL3RyZWUtdmVjdC1sb29wLmMKaW5kZXggYjBjYmJhYzBj YjUuLjFlYzA3YTIxYjRlIDEwMDY0NAotLS0gYS9nY2MvdHJlZS12ZWN0LWxv b3AuYworKysgYi9nY2MvdHJlZS12ZWN0LWxvb3AuYwpAQCAtNzE5Nyw3ICs3 MTk3LDcgQEAgdmVjdG9yaXphYmxlX3JlZHVjdGlvbiAoc3RtdF92ZWNfaW5m byBzdG10X2luZm8sIGdpbXBsZV9zdG10X2l0ZXJhdG9yICpnc2ksCiAJICAg IH0KIAkgIGVsc2UKIAkgICAgdmVjdF9yZWNvcmRfbG9vcF9tYXNrIChsb29w X3ZpbmZvLCBtYXNrcywgbmNvcGllcyAqIHZlY19udW0sCi0JCQkJICAgdmVj dHlwZV9pbik7CisJCQkJICAgdmVjdHlwZV9pbiwgMCk7CiAJfQogICAgICAg aWYgKGR1bXBfZW5hYmxlZF9wICgpCiAJICAmJiByZWR1Y3Rpb25fdHlwZSA9 PSBGT0xEX0xFRlRfUkVEVUNUSU9OKQpAQCAtODExMCw3ICs4MTEwLDcgQEAg dmVjdG9yaXphYmxlX2xpdmVfb3BlcmF0aW9uIChzdG10X3ZlY19pbmZvIHN0 bXRfaW5mbywKIAkgICAgICBnY2NfYXNzZXJ0IChuY29waWVzID09IDEgJiYg IXNscF9ub2RlKTsKIAkgICAgICB2ZWN0X3JlY29yZF9sb29wX21hc2sgKGxv b3BfdmluZm8sCiAJCQkJICAgICAmTE9PUF9WSU5GT19NQVNLUyAobG9vcF92 aW5mbyksCi0JCQkJICAgICAxLCB2ZWN0eXBlKTsKKwkJCQkgICAgIDEsIHZl Y3R5cGUsIDApOwogCSAgICB9CiAJfQogICAgICAgcmV0dXJuIHRydWU7CkBA IC04MzEzLDcgKzgzMTMsNyBAQCB2ZWN0X2RvdWJsZV9tYXNrX251bml0cyAo dHJlZSB0eXBlKQogCiB2b2lkCiB2ZWN0X3JlY29yZF9sb29wX21hc2sgKGxv b3BfdmVjX2luZm8gbG9vcF92aW5mbywgdmVjX2xvb3BfbWFza3MgKm1hc2tz LAotCQkgICAgICAgdW5zaWduZWQgaW50IG52ZWN0b3JzLCB0cmVlIHZlY3R5 cGUpCisJCSAgICAgICB1bnNpZ25lZCBpbnQgbnZlY3RvcnMsIHRyZWUgdmVj dHlwZSwgdHJlZSBzY2FsYXJfbWFzaykKIHsKICAgZ2NjX2Fzc2VydCAobnZl Y3RvcnMgIT0gMCk7CiAgIGlmIChtYXNrcy0+bGVuZ3RoICgpIDwgbnZlY3Rv cnMpCkBAIC04MzI5LDYgKzgzMjksMTIgQEAgdmVjdF9yZWNvcmRfbG9vcF9t YXNrIChsb29wX3ZlY19pbmZvIGxvb3BfdmluZm8sIHZlY19sb29wX21hc2tz ICptYXNrcywKICAgICAgIHJnbS0+bWF4X25zY2FsYXJzX3Blcl9pdGVyID0g bnNjYWxhcnNfcGVyX2l0ZXI7CiAgICAgICByZ20tPm1hc2tfdHlwZSA9IGJ1 aWxkX3NhbWVfc2l6ZWRfdHJ1dGhfdmVjdG9yX3R5cGUgKHZlY3R5cGUpOwog ICAgIH0KKworICBpZiAoc2NhbGFyX21hc2spCisgICAgeworICAgICAgc2Nh bGFyX2NvbmRfbWFza2VkX2tleSBjb25kIChzY2FsYXJfbWFzaywgbnZlY3Rv 
cnMpOworICAgICAgbG9vcF92aW5mby0+c2NhbGFyX2NvbmRfbWFza2VkX3Nl dC0+YWRkIChjb25kKTsKKyAgICB9CiB9CiAKIC8qIEdpdmVuIGEgY29tcGxl dGUgc2V0IG9mIG1hc2tzIE1BU0tTLCBleHRyYWN0IG1hc2sgbnVtYmVyIElO REVYCmRpZmYgLS1naXQgYS9nY2MvdHJlZS12ZWN0LXN0bXRzLmMgYi9nY2Mv dHJlZS12ZWN0LXN0bXRzLmMKaW5kZXggZGQ5ZDQ1YTk1NDcuLjQ5ZWE4NmEw NjgwIDEwMDY0NAotLS0gYS9nY2MvdHJlZS12ZWN0LXN0bXRzLmMKKysrIGIv Z2NjL3RyZWUtdmVjdC1zdG10cy5jCkBAIC0xODg4LDcgKzE4ODgsNyBAQCBz dGF0aWMgdm9pZAogY2hlY2tfbG9hZF9zdG9yZV9tYXNraW5nIChsb29wX3Zl Y19pbmZvIGxvb3BfdmluZm8sIHRyZWUgdmVjdHlwZSwKIAkJCSAgdmVjX2xv YWRfc3RvcmVfdHlwZSB2bHNfdHlwZSwgaW50IGdyb3VwX3NpemUsCiAJCQkg IHZlY3RfbWVtb3J5X2FjY2Vzc190eXBlIG1lbW9yeV9hY2Nlc3NfdHlwZSwK LQkJCSAgZ2F0aGVyX3NjYXR0ZXJfaW5mbyAqZ3NfaW5mbykKKwkJCSAgZ2F0 aGVyX3NjYXR0ZXJfaW5mbyAqZ3NfaW5mbywgdHJlZSBzY2FsYXJfbWFzaykK IHsKICAgLyogSW52YXJpYW50IGxvYWRzIG5lZWQgbm8gc3BlY2lhbCBzdXBw b3J0LiAgKi8KICAgaWYgKG1lbW9yeV9hY2Nlc3NfdHlwZSA9PSBWTUFUX0lO VkFSSUFOVCkKQEAgLTE5MTIsNyArMTkxMiw3IEBAIGNoZWNrX2xvYWRfc3Rv cmVfbWFza2luZyAobG9vcF92ZWNfaW5mbyBsb29wX3ZpbmZvLCB0cmVlIHZl Y3R5cGUsCiAJICByZXR1cm47CiAJfQogICAgICAgdW5zaWduZWQgaW50IG5j b3BpZXMgPSB2ZWN0X2dldF9udW1fY29waWVzIChsb29wX3ZpbmZvLCB2ZWN0 eXBlKTsKLSAgICAgIHZlY3RfcmVjb3JkX2xvb3BfbWFzayAobG9vcF92aW5m bywgbWFza3MsIG5jb3BpZXMsIHZlY3R5cGUpOworICAgICAgdmVjdF9yZWNv cmRfbG9vcF9tYXNrIChsb29wX3ZpbmZvLCBtYXNrcywgbmNvcGllcywgdmVj dHlwZSwgc2NhbGFyX21hc2spOwogICAgICAgcmV0dXJuOwogICAgIH0KIApA QCAtMTkzNiw3ICsxOTM2LDcgQEAgY2hlY2tfbG9hZF9zdG9yZV9tYXNraW5n IChsb29wX3ZlY19pbmZvIGxvb3BfdmluZm8sIHRyZWUgdmVjdHlwZSwKIAkg IHJldHVybjsKIAl9CiAgICAgICB1bnNpZ25lZCBpbnQgbmNvcGllcyA9IHZl Y3RfZ2V0X251bV9jb3BpZXMgKGxvb3BfdmluZm8sIHZlY3R5cGUpOwotICAg ICAgdmVjdF9yZWNvcmRfbG9vcF9tYXNrIChsb29wX3ZpbmZvLCBtYXNrcywg bmNvcGllcywgdmVjdHlwZSk7CisgICAgICB2ZWN0X3JlY29yZF9sb29wX21h c2sgKGxvb3BfdmluZm8sIG1hc2tzLCBuY29waWVzLCB2ZWN0eXBlLCBzY2Fs YXJfbWFzayk7CiAgICAgICByZXR1cm47CiAgICAgfQogCkBAIC0xOTc0LDcg KzE5NzQsNyBAQCBjaGVja19sb2FkX3N0b3JlX21hc2tpbmcgKGxvb3BfdmVj 
X2luZm8gbG9vcF92aW5mbywgdHJlZSB2ZWN0eXBlLAogICBwb2x5X3VpbnQ2 NCB2ZiA9IExPT1BfVklORk9fVkVDVF9GQUNUT1IgKGxvb3BfdmluZm8pOwog ICB1bnNpZ25lZCBpbnQgbnZlY3RvcnM7CiAgIGlmIChjYW5fZGl2X2F3YXlf ZnJvbV96ZXJvX3AgKGdyb3VwX3NpemUgKiB2ZiwgbnVuaXRzLCAmbnZlY3Rv cnMpKQotICAgIHZlY3RfcmVjb3JkX2xvb3BfbWFzayAobG9vcF92aW5mbywg bWFza3MsIG52ZWN0b3JzLCB2ZWN0eXBlKTsKKyAgICB2ZWN0X3JlY29yZF9s b29wX21hc2sgKGxvb3BfdmluZm8sIG1hc2tzLCBudmVjdG9ycywgdmVjdHlw ZSwgc2NhbGFyX21hc2spOwogICBlbHNlCiAgICAgZ2NjX3VucmVhY2hhYmxl ICgpOwogfQpAQCAtMzQzNiw3ICszNDM2LDkgQEAgdmVjdG9yaXphYmxlX2Nh bGwgKHN0bXRfdmVjX2luZm8gc3RtdF9pbmZvLCBnaW1wbGVfc3RtdF9pdGVy YXRvciAqZ3NpLAogCSAgdW5zaWduZWQgaW50IG52ZWN0b3JzID0gKHNscF9u b2RlCiAJCQkJICAgPyBTTFBfVFJFRV9OVU1CRVJfT0ZfVkVDX1NUTVRTIChz bHBfbm9kZSkKIAkJCQkgICA6IG5jb3BpZXMpOwotCSAgdmVjdF9yZWNvcmRf bG9vcF9tYXNrIChsb29wX3ZpbmZvLCBtYXNrcywgbnZlY3RvcnMsIHZlY3R5 cGVfb3V0KTsKKwkgIHRyZWUgc2NhbGFyX21hc2sgPSBnaW1wbGVfY2FsbF9h cmcgKHN0bXRfaW5mby0+c3RtdCwgbWFza19vcG5vKTsKKwkgIHZlY3RfcmVj b3JkX2xvb3BfbWFzayAobG9vcF92aW5mbywgbWFza3MsIG52ZWN0b3JzLAor CQkJCSB2ZWN0eXBlX291dCwgc2NhbGFyX21hc2spOwogCX0KICAgICAgIHJl dHVybiB0cnVlOwogICAgIH0KQEAgLTczOTAsNyArNzM5Miw3IEBAIHZlY3Rv cml6YWJsZV9zdG9yZSAoc3RtdF92ZWNfaW5mbyBzdG10X2luZm8sIGdpbXBs ZV9zdG10X2l0ZXJhdG9yICpnc2ksCiAgICAgICBpZiAobG9vcF92aW5mbwog CSAgJiYgTE9PUF9WSU5GT19DQU5fRlVMTFlfTUFTS19QIChsb29wX3ZpbmZv KSkKIAljaGVja19sb2FkX3N0b3JlX21hc2tpbmcgKGxvb3BfdmluZm8sIHZl Y3R5cGUsIHZsc190eXBlLCBncm91cF9zaXplLAotCQkJCSAgbWVtb3J5X2Fj Y2Vzc190eXBlLCAmZ3NfaW5mbyk7CisJCQkJICBtZW1vcnlfYWNjZXNzX3R5 cGUsICZnc19pbmZvLCAwKTsKIAogICAgICAgU1RNVF9WSU5GT19UWVBFIChz dG10X2luZm8pID0gc3RvcmVfdmVjX2luZm9fdHlwZTsKICAgICAgIHZlY3Rf bW9kZWxfc3RvcmVfY29zdCAoc3RtdF9pbmZvLCBuY29waWVzLCByaHNfZHQs IG1lbW9yeV9hY2Nlc3NfdHlwZSwKQEAgLTg2MzcsNyArODYzOSw3IEBAIHZl Y3Rvcml6YWJsZV9sb2FkIChzdG10X3ZlY19pbmZvIHN0bXRfaW5mbywgZ2lt cGxlX3N0bXRfaXRlcmF0b3IgKmdzaSwKICAgICAgIGlmIChsb29wX3ZpbmZv CiAJICAmJiBMT09QX1ZJTkZPX0NBTl9GVUxMWV9NQVNLX1AgKGxvb3Bfdmlu 
Zm8pKQogCWNoZWNrX2xvYWRfc3RvcmVfbWFza2luZyAobG9vcF92aW5mbywg dmVjdHlwZSwgVkxTX0xPQUQsIGdyb3VwX3NpemUsCi0JCQkJICBtZW1vcnlf YWNjZXNzX3R5cGUsICZnc19pbmZvKTsKKwkJCQkgIG1lbW9yeV9hY2Nlc3Nf dHlwZSwgJmdzX2luZm8sIG1hc2spOwogCiAgICAgICBTVE1UX1ZJTkZPX1RZ UEUgKHN0bXRfaW5mbykgPSBsb2FkX3ZlY19pbmZvX3R5cGU7CiAgICAgICB2 ZWN0X21vZGVsX2xvYWRfY29zdCAoc3RtdF9pbmZvLCBuY29waWVzLCBtZW1v cnlfYWNjZXNzX3R5cGUsCkBAIC05NzYzLDYgKzk3NjUsMjkgQEAgdmVjdF9p c19zaW1wbGVfY29uZCAodHJlZSBjb25kLCB2ZWNfaW5mbyAqdmluZm8sCiAg IHJldHVybiB0cnVlOwogfQogCitzdGF0aWMgdm9pZAordmVjdG9yaXphYmxl X2NvbmRpdGlvbl9hcHBseV9sb29wX21hc2sgKHRyZWUgJnZlY19jb21wYXJl LAorCQkJCQlnaW1wbGVfc3RtdF9pdGVyYXRvciAqJmdzaSwKKwkJCQkJc3Rt dF92ZWNfaW5mbyAmc3RtdF9pbmZvLAorCQkJCQl0cmVlIGxvb3BfbWFzaywK KwkJCQkJdHJlZSB2ZWNfY21wX3R5cGUpCit7CisgIGlmIChDT01QQVJJU09O X0NMQVNTX1AgKHZlY19jb21wYXJlKSkKKyAgICB7CisgICAgICB0cmVlIHRt cCA9IG1ha2Vfc3NhX25hbWUgKHZlY19jbXBfdHlwZSk7CisgICAgICBnYXNz aWduICpnID0gZ2ltcGxlX2J1aWxkX2Fzc2lnbiAodG1wLCBUUkVFX0NPREUg KHZlY19jb21wYXJlKSwKKwkJCQkJVFJFRV9PUEVSQU5EICh2ZWNfY29tcGFy ZSwgMCksCisJCQkJCVRSRUVfT1BFUkFORCAodmVjX2NvbXBhcmUsIDEpKTsK KyAgICAgIHZlY3RfZmluaXNoX3N0bXRfZ2VuZXJhdGlvbiAoc3RtdF9pbmZv LCBnLCBnc2kpOworICAgICAgdmVjX2NvbXBhcmUgPSB0bXA7CisgICAgfQor CisgIHRyZWUgdG1wMiA9IG1ha2Vfc3NhX25hbWUgKHZlY19jbXBfdHlwZSk7 CisgIGdhc3NpZ24gKmcgPSBnaW1wbGVfYnVpbGRfYXNzaWduICh0bXAyLCBC SVRfQU5EX0VYUFIsIHZlY19jb21wYXJlLCBsb29wX21hc2spOworICB2ZWN0 X2ZpbmlzaF9zdG10X2dlbmVyYXRpb24gKHN0bXRfaW5mbywgZywgZ3NpKTsK KyAgdmVjX2NvbXBhcmUgPSB0bXAyOworfQorCiAvKiB2ZWN0b3JpemFibGVf Y29uZGl0aW9uLgogCiAgICBDaGVjayBpZiBTVE1UX0lORk8gaXMgY29uZGl0 aW9uYWwgbW9kaWZ5IGV4cHJlc3Npb24gdGhhdCBjYW4gYmUgdmVjdG9yaXpl ZC4KQEAgLTk5NzUsNiArMTAwMDAsMzYgQEAgdmVjdG9yaXphYmxlX2NvbmRp dGlvbiAoc3RtdF92ZWNfaW5mbyBzdG10X2luZm8sIGdpbXBsZV9zdG10X2l0 ZXJhdG9yICpnc2ksCiAgIC8qIEhhbmRsZSBjb25kIGV4cHIuICAqLwogICBm b3IgKGogPSAwOyBqIDwgbmNvcGllczsgaisrKQogICAgIHsKKyAgICAgIHRy ZWUgbG9vcF9tYXNrID0gTlVMTF9UUkVFOworCisgICAgICBpZiAobG9vcF92 
aW5mbyAmJiBMT09QX1ZJTkZPX0ZVTExZX01BU0tFRF9QIChsb29wX3ZpbmZv KSkKKwl7CisJICBzY2FsYXJfY29uZF9tYXNrZWRfa2V5IGNvbmQgKGNvbmRf ZXhwciwgbmNvcGllcyk7CisgICAgICAgICAgaWYgKGxvb3BfdmluZm8tPnNj YWxhcl9jb25kX21hc2tlZF9zZXQtPmNvbnRhaW5zIChjb25kKSkKKwkgICAg eworCSAgICAgIHNjYWxhcl9jb25kX21hc2tlZF9rZXkgY29uZCAoY29uZF9l eHByLCBuY29waWVzKTsKKwkgICAgICBpZiAobG9vcF92aW5mby0+c2NhbGFy X2NvbmRfbWFza2VkX3NldC0+Y29udGFpbnMgKGNvbmQpKQorCQl7CisJCSAg dmVjX2xvb3BfbWFza3MgKm1hc2tzID0gJkxPT1BfVklORk9fTUFTS1MgKGxv b3BfdmluZm8pOworCQkgIGxvb3BfbWFzayA9IHZlY3RfZ2V0X2xvb3BfbWFz ayAoZ3NpLCBtYXNrcywgbmNvcGllcywgdmVjdHlwZSwgaik7CisJCX0KKwkg ICAgfQorCSAgZWxzZQorCSAgICB7CisJICAgICAgY29uZC5jb25kX29wcy5j b2RlCisJCT0gaW52ZXJ0X3RyZWVfY29tcGFyaXNvbiAoY29uZC5jb25kX29w cy5jb2RlLCB0cnVlKTsKKwkgICAgICBpZiAobG9vcF92aW5mby0+c2NhbGFy X2NvbmRfbWFza2VkX3NldC0+Y29udGFpbnMgKGNvbmQpKQorCQl7CisJCSAg dmVjX2xvb3BfbWFza3MgKm1hc2tzID0gJkxPT1BfVklORk9fTUFTS1MgKGxv b3BfdmluZm8pOworCQkgIGxvb3BfbWFzayA9IHZlY3RfZ2V0X2xvb3BfbWFz ayAoZ3NpLCBtYXNrcywgbmNvcGllcywgdmVjdHlwZSwgaik7CisJCSAgc3Rk Ojpzd2FwICh0aGVuX2NsYXVzZSwgZWxzZV9jbGF1c2UpOworCQkgIGNvbmRf Y29kZSA9IGNvbmQuY29uZF9vcHMuY29kZTsKKwkJICBjb25kX2V4cHIgPSBi dWlsZDIgKGNvbmRfY29kZSwgVFJFRV9UWVBFIChjb25kX2V4cHIpLAorCQkJ CSAgICAgIHRoZW5fY2xhdXNlLCBlbHNlX2NsYXVzZSk7CisJCX0KKwkgICAg fQorCX0KKwogICAgICAgc3RtdF92ZWNfaW5mbyBuZXdfc3RtdF9pbmZvID0g TlVMTDsKICAgICAgIGlmIChqID09IDApCiAJewpAQCAtMTAwOTAsNiArMTAx NDUsMTEgQEAgdmVjdG9yaXphYmxlX2NvbmRpdGlvbiAoc3RtdF92ZWNfaW5m byBzdG10X2luZm8sIGdpbXBsZV9zdG10X2l0ZXJhdG9yICpnc2ksCiAJCSAg ICB9CiAJCX0KIAkgICAgfQorCisJICBpZiAobG9vcF9tYXNrKQorCSAgICB2 ZWN0b3JpemFibGVfY29uZGl0aW9uX2FwcGx5X2xvb3BfbWFzayAodmVjX2Nv bXBhcmUsIGdzaSwgc3RtdF9pbmZvLAorCQkJCQkJICAgIGxvb3BfbWFzaywg dmVjX2NtcF90eXBlKTsKKwogCSAgaWYgKHJlZHVjdGlvbl90eXBlID09IEVY VFJBQ1RfTEFTVF9SRURVQ1RJT04pCiAJICAgIHsKIAkgICAgICBpZiAoIWlz X2dpbXBsZV92YWwgKHZlY19jb21wYXJlKSkKZGlmZiAtLWdpdCBhL2djYy90 cmVlLXZlY3Rvcml6ZXIuYyBiL2djYy90cmVlLXZlY3Rvcml6ZXIuYwppbmRl 
eCBkYzE4MTUyNDc0NC4uNzk0ZTY1ZjAwMDcgMTAwNjQ0Ci0tLSBhL2djYy90 cmVlLXZlY3Rvcml6ZXIuYworKysgYi9nY2MvdHJlZS12ZWN0b3JpemVyLmMK QEAgLTQ2NCw2ICs0NjQsNyBAQCB2ZWNfaW5mbzo6dmVjX2luZm8gKHZlY19p bmZvOjp2ZWNfa2luZCBraW5kX2luLCB2b2lkICp0YXJnZXRfY29zdF9kYXRh X2luLAogICAgIHRhcmdldF9jb3N0X2RhdGEgKHRhcmdldF9jb3N0X2RhdGFf aW4pCiB7CiAgIHN0bXRfdmVjX2luZm9zLmNyZWF0ZSAoNTApOworICBzY2Fs YXJfY29uZF9tYXNrZWRfc2V0ID0gbmV3IHNjYWxhcl9jb25kX21hc2tlZF9z ZXRfdHlwZSAoKTsKIH0KIAogdmVjX2luZm86On52ZWNfaW5mbyAoKQpAQCAt NDc2LDYgKzQ3Nyw4IEBAIHZlY19pbmZvOjp+dmVjX2luZm8gKCkKIAogICBk ZXN0cm95X2Nvc3RfZGF0YSAodGFyZ2V0X2Nvc3RfZGF0YSk7CiAgIGZyZWVf c3RtdF92ZWNfaW5mb3MgKCk7CisgIGRlbGV0ZSBzY2FsYXJfY29uZF9tYXNr ZWRfc2V0OworICBzY2FsYXJfY29uZF9tYXNrZWRfc2V0ID0gMDsKIH0KIAog dmVjX2luZm9fc2hhcmVkOjp2ZWNfaW5mb19zaGFyZWQgKCkKQEAgLTE1MTMs MyArMTUxNiwzOCBAQCBtYWtlX3Bhc3NfaXBhX2luY3JlYXNlX2FsaWdubWVu dCAoZ2NjOjpjb250ZXh0ICpjdHh0KQogewogICByZXR1cm4gbmV3IHBhc3Nf aXBhX2luY3JlYXNlX2FsaWdubWVudCAoY3R4dCk7CiB9CisKKy8qIElmIGNv ZGUoVCkgaXMgY29tcGFyaXNvbiBvcCBvciBkZWYgb2YgY29tcGFyaXNvbiBz dG10LAorICAgZXh0cmFjdCBpdCdzIG9wZXJhbmRzLgorICAgRWxzZSByZXR1 cm4gPE5FX0VYUFIsIFQsIDA+LiAgKi8KKwordHJlZV9jb25kX29wczo6dHJl ZV9jb25kX29wcyAodHJlZSB0KQoreworICBpZiAoVFJFRV9DT0RFX0NMQVNT IChUUkVFX0NPREUgKHQpKSA9PSB0Y2NfY29tcGFyaXNvbikKKyAgICB7Cisg ICAgICB0aGlzLT5jb2RlID0gVFJFRV9DT0RFICh0KTsKKyAgICAgIHRoaXMt Pm9wMCA9IFRSRUVfT1BFUkFORCAodCwgMCk7CisgICAgICB0aGlzLT5vcDEg PSBUUkVFX09QRVJBTkQgKHQsIDEpOworICAgICAgcmV0dXJuOworICAgIH0K KworICBpZiAoVFJFRV9DT0RFICh0KSA9PSBTU0FfTkFNRSkKKyAgICB7Cisg ICAgICBnYXNzaWduICpzdG10ID0gZHluX2Nhc3Q8Z2Fzc2lnbiAqPiAoU1NB X05BTUVfREVGX1NUTVQgKHQpKTsKKyAgICAgIGlmIChzdG10KQorICAgICAg ICB7CisgICAgICAgICAgdHJlZV9jb2RlIGNvZGUgPSBnaW1wbGVfYXNzaWdu X3Joc19jb2RlIChzdG10KTsKKyAgICAgICAgICBpZiAoVFJFRV9DT0RFX0NM QVNTIChjb2RlKSA9PSB0Y2NfY29tcGFyaXNvbikKKyAgICAgICAgICAgIHsK KyAgICAgICAgICAgICAgdGhpcy0+Y29kZSA9IGNvZGU7CisgICAgICAgICAg ICAgIHRoaXMtPm9wMCA9IGdpbXBsZV9hc3NpZ25fcmhzMSAoc3RtdCk7Cisg 
ICAgICAgICAgICAgIHRoaXMtPm9wMSA9IGdpbXBsZV9hc3NpZ25fcmhzMiAo c3RtdCk7CisgICAgICAgICAgICAgIHJldHVybjsKKyAgICAgICAgICAgIH0K KyAgICAgICAgfQorICAgIH0KKworICB0aGlzLT5jb2RlID0gTkVfRVhQUjsK KyAgdGhpcy0+b3AwID0gdDsKKyAgdGhpcy0+b3AxID0gYnVpbGRfemVyb19j c3QgKFRSRUVfVFlQRSAodCkpOworfQpkaWZmIC0tZ2l0IGEvZ2NjL3RyZWUt dmVjdG9yaXplci5oIGIvZ2NjL3RyZWUtdmVjdG9yaXplci5oCmluZGV4IDE0 NTZjZGU0YzJjLi4wZTc0MGM0YmE3YyAxMDA2NDQKLS0tIGEvZ2NjL3RyZWUt dmVjdG9yaXplci5oCisrKyBiL2djYy90cmVlLXZlY3Rvcml6ZXIuaApAQCAt MjYsNiArMjYsNyBAQCB0eXBlZGVmIGNsYXNzIF9zdG10X3ZlY19pbmZvICpz dG10X3ZlY19pbmZvOwogI2luY2x1ZGUgInRyZWUtZGF0YS1yZWYuaCIKICNp bmNsdWRlICJ0cmVlLWhhc2gtdHJhaXRzLmgiCiAjaW5jbHVkZSAidGFyZ2V0 LmgiCisjaW5jbHVkZSAiaGFzaC1zZXQuaCIKIAogLyogVXNlZCBmb3IgbmFt aW5nIG9mIG5ldyB0ZW1wb3Jhcmllcy4gICovCiBlbnVtIHZlY3RfdmFyX2tp bmQgewpAQCAtMTc0LDcgKzE3NSw4MCBAQCBwdWJsaWM6CiAjZGVmaW5lIFNM UF9UUkVFX1RXT19PUEVSQVRPUlMoUykJCSAoUyktPnR3b19vcGVyYXRvcnMK ICNkZWZpbmUgU0xQX1RSRUVfREVGX1RZUEUoUykJCQkgKFMpLT5kZWZfdHlw ZQogCitzdHJ1Y3QgdHJlZV9jb25kX29wcworeworICB0cmVlX2NvZGUgY29k ZTsKKyAgdHJlZSBvcDA7CisgIHRyZWUgb3AxOworCisgIHRyZWVfY29uZF9v cHMgKHRyZWUpOworfTsKKworaW5saW5lIGJvb2wKK29wZXJhdG9yPT0gKGNv bnN0IHRyZWVfY29uZF9vcHMmIG8xLCBjb25zdCB0cmVlX2NvbmRfb3BzICZv MikKK3sKKyAgcmV0dXJuIChvMS5jb2RlID09IG8yLmNvZGUKKyAgICAgICAg ICAmJiBvcGVyYW5kX2VxdWFsX3AgKG8xLm9wMCwgbzIub3AwLCAwKQorICAg ICAgICAgICYmIG9wZXJhbmRfZXF1YWxfcCAobzEub3AxLCBvMi5vcDEsIDAp KTsKK30KKworc3RydWN0IHNjYWxhcl9jb25kX21hc2tlZF9rZXkKK3sKKyAg c2NhbGFyX2NvbmRfbWFza2VkX2tleSAodHJlZSB0LCB1bnNpZ25lZCBuY29w aWVzXykKKyAgICA6IGNvbmRfb3BzICh0KSwgbmNvcGllcyAobmNvcGllc18p CisgIHt9CisKKyAgdHJlZV9jb25kX29wcyBjb25kX29wczsKKyAgdW5zaWdu ZWQgbmNvcGllczsKK307CisKK3RlbXBsYXRlPD4KK3N0cnVjdCBkZWZhdWx0 X2hhc2hfdHJhaXRzPHNjYWxhcl9jb25kX21hc2tlZF9rZXk+Cit7CisgIHR5 cGVkZWYgc2NhbGFyX2NvbmRfbWFza2VkX2tleSBjb21wYXJlX3R5cGU7Cisg IHR5cGVkZWYgc2NhbGFyX2NvbmRfbWFza2VkX2tleSB2YWx1ZV90eXBlOwor CisgIHN0YXRpYyBpbmxpbmUgaGFzaHZhbF90CisgIGhhc2ggKHZhbHVlX3R5 
cGUgdikKKyAgeworICAgIGluY2hhc2g6Omhhc2ggaDsKKyAgICBoLmFkZF9p bnQgKHYuY29uZF9vcHMuY29kZSk7CisgICAgaW5jaGFzaDo6YWRkX2V4cHIg KHYuY29uZF9vcHMub3AwLCBoLCAwKTsKKyAgICBpbmNoYXNoOjphZGRfZXhw ciAodi5jb25kX29wcy5vcDEsIGgsIDApOworICAgIGguYWRkX2ludCAodi5u Y29waWVzKTsKKyAgICByZXR1cm4gaC5lbmQgKCk7CisgIH0KKworICBzdGF0 aWMgaW5saW5lIGJvb2wKKyAgZXF1YWwgKHZhbHVlX3R5cGUgZXhpc3Rpbmcs IHZhbHVlX3R5cGUgY2FuZGlkYXRlKQorICB7CisgICAgcmV0dXJuIChleGlz dGluZy5jb25kX29wcyA9PSBjYW5kaWRhdGUuY29uZF9vcHMKKwkgICAgJiYg ZXhpc3RpbmcubmNvcGllcyA9PSBjYW5kaWRhdGUubmNvcGllcyk7CisgIH0K KworICBzdGF0aWMgaW5saW5lIHZvaWQKKyAgbWFya19lbXB0eSAodmFsdWVf dHlwZSAmdikKKyAgeworICAgIHYubmNvcGllcyA9IDA7CisgIH0KKworICBz dGF0aWMgaW5saW5lIGJvb2wKKyAgaXNfZW1wdHkgKHZhbHVlX3R5cGUgdikK KyAgeworICAgIHJldHVybiB2Lm5jb3BpZXMgPT0gMDsKKyAgfQorCisgIHN0 YXRpYyBpbmxpbmUgdm9pZCBtYXJrX2RlbGV0ZWQgKHZhbHVlX3R5cGUgJikg e30KKworICBzdGF0aWMgaW5saW5lIGJvb2wgaXNfZGVsZXRlZCAoY29uc3Qg dmFsdWVfdHlwZSAmKQorICB7CisgICAgcmV0dXJuIGZhbHNlOworICB9CisK KyAgc3RhdGljIGlubGluZSB2b2lkIHJlbW92ZSAodmFsdWVfdHlwZSAmKSB7 fQorfTsKIAordHlwZWRlZiBoYXNoX3NldDxzY2FsYXJfY29uZF9tYXNrZWRf a2V5PiBzY2FsYXJfY29uZF9tYXNrZWRfc2V0X3R5cGU7CiAKIC8qIERlc2Ny aWJlcyB0d28gb2JqZWN0cyB3aG9zZSBhZGRyZXNzZXMgbXVzdCBiZSB1bmVx dWFsIGZvciB0aGUgdmVjdG9yaXplZAogICAgbG9vcCB0byBiZSB2YWxpZC4g ICovCkBAIC0yNTUsNiArMzI5LDkgQEAgcHVibGljOgogICAvKiBDb3N0IGRh dGEgdXNlZCBieSB0aGUgdGFyZ2V0IGNvc3QgbW9kZWwuICAqLwogICB2b2lk ICp0YXJnZXRfY29zdF9kYXRhOwogCisgIC8qIFNldCBvZiBzY2FsYXIgY29u ZGl0aW9ucyB0aGF0IGhhdmUgbG9vcCBtYXNrIGFwcGxpZWQuICAqLworICBz Y2FsYXJfY29uZF9tYXNrZWRfc2V0X3R5cGUgKnNjYWxhcl9jb25kX21hc2tl ZF9zZXQ7CisKIHByaXZhdGU6CiAgIHN0bXRfdmVjX2luZm8gbmV3X3N0bXRf dmVjX2luZm8gKGdpbXBsZSAqc3RtdCk7CiAgIHZvaWQgc2V0X3ZpbmZvX2Zv cl9zdG10IChnaW1wbGUgKiwgc3RtdF92ZWNfaW5mbyk7CkBAIC0xNjE3LDcg KzE2OTQsNyBAQCBleHRlcm4gdm9pZCB2ZWN0X2dlbl92ZWN0b3JfbG9vcF9u aXRlcnMgKGxvb3BfdmVjX2luZm8sIHRyZWUsIHRyZWUgKiwKIGV4dGVybiB0 cmVlIHZlY3RfaGFsdmVfbWFza19udW5pdHMgKHRyZWUpOwogZXh0ZXJuIHRy 
ZWUgdmVjdF9kb3VibGVfbWFza19udW5pdHMgKHRyZWUpOwogZXh0ZXJuIHZv aWQgdmVjdF9yZWNvcmRfbG9vcF9tYXNrIChsb29wX3ZlY19pbmZvLCB2ZWNf bG9vcF9tYXNrcyAqLAotCQkJCSAgIHVuc2lnbmVkIGludCwgdHJlZSk7CisJ CQkJICAgdW5zaWduZWQgaW50LCB0cmVlLCB0cmVlKTsKIGV4dGVybiB0cmVl IHZlY3RfZ2V0X2xvb3BfbWFzayAoZ2ltcGxlX3N0bXRfaXRlcmF0b3IgKiwg dmVjX2xvb3BfbWFza3MgKiwKIAkJCQl1bnNpZ25lZCBpbnQsIHRyZWUsIHVu c2lnbmVkIGludCk7CiAK --00000000000088278d05916a5714--
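
The analysis-phase idea discussed in this thread (record which scalar conditions get a loop mask, then let the COND_EXPR code query that set, also trying the inverted comparison) can be sketched outside GCC roughly as below. This is only a minimal standalone model under stated assumptions, not the attached patch: CmpCode, CondKey, CondKeyHash, MaskedSet, MaskUse and classify are hypothetical names, operands are plain strings rather than trees, std::unordered_set stands in for GCC's hash_set, and the inversion ignores NaN handling.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <string>
#include <unordered_set>

// Hypothetical stand-in for the comparison subset of GCC's tree_code.
enum class CmpCode { EQ, NE, LT, GE, LE, GT };

// Logical inverse of a comparison, assuming integer operands (no NaNs).
inline CmpCode invert(CmpCode c) {
  switch (c) {
    case CmpCode::EQ: return CmpCode::NE;
    case CmpCode::NE: return CmpCode::EQ;
    case CmpCode::LT: return CmpCode::GE;
    case CmpCode::GE: return CmpCode::LT;
    case CmpCode::LE: return CmpCode::GT;
    case CmpCode::GT: return CmpCode::LE;
  }
  return c;  // Not reached for valid enumerators.
}

// Key modelled on "decomposed condition plus ncopies"; the operands are
// strings here purely for illustration.
struct CondKey {
  CmpCode code;
  std::string op0;
  std::string op1;
  unsigned ncopies;

  bool operator==(const CondKey &o) const {
    return code == o.code && op0 == o.op0 && op1 == o.op1
           && ncopies == o.ncopies;
  }
};

// Hash that combines all four fields of the key.
struct CondKeyHash {
  std::size_t operator()(const CondKey &k) const {
    std::size_t h = std::hash<int>()(static_cast<int>(k.code));
    auto mix = [&h](std::size_t v) {
      h ^= v + 0x9e3779b9u + (h << 6) + (h >> 2);
    };
    mix(std::hash<std::string>()(k.op0));
    mix(std::hash<std::string>()(k.op1));
    mix(std::hash<unsigned>()(k.ncopies));
    return h;
  }
};

// Set of conditions recorded during analysis as having a loop mask applied.
using MaskedSet = std::unordered_set<CondKey, CondKeyHash>;

enum class MaskUse {
  None,     // Neither the condition nor its inverse was recorded.
  Direct,   // The condition itself was recorded: apply the loop mask.
  Inverted  // Only the inverse was recorded: invert and swap the arms.
};

// Decide how a conditional select should treat the loop mask.
inline MaskUse classify(const MaskedSet &set, const CondKey &cond) {
  if (set.count(cond))
    return MaskUse::Direct;
  CondKey inv = cond;
  inv.code = invert(cond.code);
  if (set.count(inv))
    return MaskUse::Inverted;
  return MaskUse::None;
}
```

In this model, a masked load keyed on `_4 == 0` with one copy would be inserted into the set during analysis; a later select on `_4 == 0` classifies as Direct, a select on `_4 != 0` classifies as Inverted (so its then/else arms would be swapped), and the same condition with a different ncopies value is not matched.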