From mboxrd@z Thu Jan 1 00:00:00 1970
From: Prathamesh Kulkarni
Date: Tue, 7 Jun 2022 16:17:21 +0530
Subject: Re: [1/2] PR96463 - aarch64 specific changes
To: Prathamesh Kulkarni, gcc Patches, richard.sandiford@arm.com
List-Id: Gcc-patches mailing list
MIME-Version: 1.0
Content-Type: multipart/mixed; boundary="0000000000004eaa9705e0d9520e"

--0000000000004eaa9705e0d9520e
Content-Type: text/plain; charset="UTF-8"
Content-Transfer-Encoding: 8bit

On Mon, 6 Jun 2022 at 16:29, Richard Sandiford wrote:
>
> Prathamesh Kulkarni writes:
> >> >  {
> >> >    /* The pattern matching functions above are written to look for a small
> >> >       number to begin the sequence (0, 1, N/2).  If we begin with an index
> >> > @@ -24084,6 +24112,12 @@ aarch64_expand_vec_perm_const_1 (struct expand_vec_perm_d *d)
> >> >         || d->vec_flags == VEC_SVE_PRED)
> >> >        && known_gt (nelt, 1))
> >> >      {
> >> > +      /* If operand and result modes differ, then only check
> >> > +         for dup case.  */
> >> > +      if (d->vmode != op_mode)
> >> > +        return (d->vec_flags == VEC_SVE_DATA)
> >> > +                ? aarch64_evpc_sve_dup (d, op_mode) : false;
> >> > +
> >>
> >> I think it'd be more future-proof to format this as:
> >>
> >>     if (d->vmode == d->op_mode)
> >>       {
> >>         …existing code…
> >>       }
> >>     else
> >>       {
> >>         if (aarch64_evpc_sve_dup (d))
> >>           return true;
> >>       }
> >>
> >> with the d->vec_flags == VEC_SVE_DATA check being in aarch64_evpc_sve_dup,
> >> alongside the op_mode check.  I think we'll be adding more checks here
> >> over time.
> > Um I was wondering if we should structure it as:
> > if (d->vmode == d->op_mode)
> >   {
> >     ...existing code...
> >   }
> > if (aarch64_evpc_sve_dup (d))
> >   return true;
> >
> > So we check for dup irrespective of d->vmode == d->op_mode ?
>
> Yeah, I can see the attraction of that.  I think the else is better
> though because the fallback TBL handling will (rightly) come at the end
> of the existing code.  Without the else, we'd have specific tests like
> DUP after generic ones like TBL, so the reader would have to work out
> for themselves that DUP and TBL handle disjoint cases.
>
> >> >        if (aarch64_evpc_rev_local (d))
> >> >          return true;
> >> >        else if (aarch64_evpc_rev_global (d))
> >> > @@ -24105,7 +24139,12 @@ aarch64_expand_vec_perm_const_1 (struct expand_vec_perm_d *d)
> >> >        else if (aarch64_evpc_reencode (d))
> >> >          return true;
> >> >        if (d->vec_flags == VEC_SVE_DATA)
> >> > -        return aarch64_evpc_sve_tbl (d);
> >> > +        {
> >> > +          if (aarch64_evpc_sve_tbl (d))
> >> > +            return true;
> >> > +          else if (aarch64_evpc_sve_dup (d, op_mode))
> >> > +            return true;
> >> > +        }
> >> >        else if (d->vec_flags == VEC_ADVSIMD)
> >> >          return aarch64_evpc_tbl (d);
> >> >      }
> >>
> >> Is this part still needed, given the above?
> >>
> >> Thanks,
> >> Richard
> >>
> >> > @@ -24119,9 +24158,6 @@ aarch64_vectorize_vec_perm_const (machine_mode vmode, machine_mode op_mode,
> >> >                                    rtx target, rtx op0, rtx op1,
> >> >                                    const vec_perm_indices &sel)
> >> >  {
> >> > -  if (vmode != op_mode)
> >> > -    return false;
> >> > -
> >> >    struct expand_vec_perm_d d;
> >> >
> >> >    /* Check whether the mask can be applied to a single vector.  */
> >> > @@ -24154,10 +24190,10 @@ aarch64_vectorize_vec_perm_const (machine_mode vmode, machine_mode op_mode,
> >> >    d.testing_p = !target;
> >> >
> >> >    if (!d.testing_p)
> >> > -    return aarch64_expand_vec_perm_const_1 (&d);
> >> > +    return aarch64_expand_vec_perm_const_1 (&d, op_mode);
> >> >
> >> >    rtx_insn *last = get_last_insn ();
> >> > -  bool ret = aarch64_expand_vec_perm_const_1 (&d);
> >> > +  bool ret = aarch64_expand_vec_perm_const_1 (&d, op_mode);
> >> >    gcc_assert (last == get_last_insn ());
> >> >
> >> >    return ret;
> >
> > diff --git a/gcc/config/aarch64/aarch64-sve-builtins-base.cc b/gcc/config/aarch64/aarch64-sve-builtins-base.cc
> > index bee410929bd..1a804b1ab73 100644
> > --- a/gcc/config/aarch64/aarch64-sve-builtins-base.cc
> > +++ b/gcc/config/aarch64/aarch64-sve-builtins-base.cc
> > @@ -44,6 +44,7 @@
> >  #include "aarch64-sve-builtins-shapes.h"
> >  #include "aarch64-sve-builtins-base.h"
> >  #include "aarch64-sve-builtins-functions.h"
> > +#include "ssa.h"
> >
> >  using namespace aarch64_sve;
> >
> > @@ -1207,6 +1208,64 @@ public:
> >      insn_code icode = code_for_aarch64_sve_ld1rq (e.vector_mode (0));
> >      return e.use_contiguous_load_insn (icode);
> >    }
> > +
> > +  gimple *
> > +  fold (gimple_folder &f) const override
> > +  {
> > +    tree arg0 = gimple_call_arg (f.call, 0);
> > +    tree arg1 = gimple_call_arg (f.call, 1);
> > +
> > +    /* Transform:
> > +       lhs = svld1rq ({-1, -1, ... }, arg1)
> > +       into:
> > +       tmp = mem_ref<vectype> [(int * {ref-all}) arg1]
> > +       lhs = vec_perm_expr<tmp, tmp, {0, 1, 2, 3, ...}>.
> > +       on little endian target.
> > +       vectype is the corresponding ADVSIMD type.  */
> > +
> > +    if (!BYTES_BIG_ENDIAN
> > +        && integer_all_onesp (arg0))
> > +      {
> > +        tree lhs = gimple_call_lhs (f.call);
> > +        tree lhs_type = TREE_TYPE (lhs);
> > +        poly_uint64 lhs_len = TYPE_VECTOR_SUBPARTS (lhs_type);
> > +        tree eltype = TREE_TYPE (lhs_type);
> > +
> > +        scalar_mode elmode = GET_MODE_INNER (TYPE_MODE (lhs_type));
> > +        machine_mode vq_mode = aarch64_vq_mode (elmode).require ();
> > +        tree vectype = build_vector_type_for_mode (eltype, vq_mode);
> > +
> > +        tree elt_ptr_type
> > +          = build_pointer_type_for_mode (eltype, VOIDmode, true);
> > +        tree zero = build_zero_cst (elt_ptr_type);
> > +
> > +        /* Use element type alignment.  */
> > +        tree access_type
> > +          = build_aligned_type (vectype, TYPE_ALIGN (eltype));
> > +
> > +        tree mem_ref_lhs = make_ssa_name_fn (cfun, access_type, 0);
> > +        tree mem_ref_op = fold_build2 (MEM_REF, access_type, arg1, zero);
> > +        gimple *mem_ref_stmt
> > +          = gimple_build_assign (mem_ref_lhs, mem_ref_op);
> > +        gsi_insert_before (f.gsi, mem_ref_stmt, GSI_SAME_STMT);
> > +
> > +        int source_nelts = TYPE_VECTOR_SUBPARTS (access_type).to_constant ();
> > +        vec_perm_builder sel (lhs_len, source_nelts, 1);
> > +        for (int i = 0; i < source_nelts; i++)
> > +          sel.quick_push (i);
> > +
> > +        vec_perm_indices indices (sel, 1, source_nelts);
> > +        gcc_checking_assert (can_vec_perm_const_p (TYPE_MODE (lhs_type),
> > +                                                   TYPE_MODE (access_type),
> > +                                                   indices));
> > +        tree mask_type = build_vector_type (ssizetype, lhs_len);
> > +        tree mask = vec_perm_indices_to_tree (mask_type, indices);
> > +        return gimple_build_assign (lhs, VEC_PERM_EXPR,
> > +                                    mem_ref_lhs, mem_ref_lhs, mask);
> > +      }
> > +
> > +    return NULL;
> > +  }
> >  };
> >
> >  class svld1ro_impl : public load_replicate
> > diff --git a/gcc/config/aarch64/aarch64.cc b/gcc/config/aarch64/aarch64.cc
> > index d4c575ce976..bb24701b0d2 100644
> > --- a/gcc/config/aarch64/aarch64.cc
> > +++ b/gcc/config/aarch64/aarch64.cc
> > @@ -23395,8 +23395,10 @@ struct expand_vec_perm_d
> >  {
> >    rtx target, op0, op1;
> >    vec_perm_indices perm;
> > +  machine_mode op_mode;
> >    machine_mode vmode;
> >    unsigned int vec_flags;
> > +  unsigned int op_vec_flags;
>
> Very minor, but it would be good to keep the order consistent:
> output mode first or input mode first.  Guess it might as well
> be output mode first, to match the hook:
>
>   machine_mode vmode;
>   machine_mode op_mode;
>   unsigned int vec_flags;
>   unsigned int op_vec_flags;
>
> >    bool one_vector_p;
> >    bool testing_p;
> >  };
> > @@ -23945,6 +23947,32 @@ aarch64_evpc_sve_tbl (struct expand_vec_perm_d *d)
> >    return true;
> >  }
> >
> > +/* Try to implement D using SVE dup instruction.  */
> > +
> > +static bool
> > +aarch64_evpc_sve_dup (struct expand_vec_perm_d *d)
> > +{
> > +  if (BYTES_BIG_ENDIAN
> > +      || !d->one_vector_p
> > +      || d->vec_flags != VEC_SVE_DATA
> > +      || d->op_vec_flags != VEC_ADVSIMD
>
> Sorry, one more: DUPQ only handles 128-bit AdvSIMD modes, so we also need:
>
>       || !known_eq (GET_MODE_BITSIZE (d->op_mode), 128)
>
> This isn't redundant with any of the other tests.
>
> (We can use DUP .D for 64-bit input vectors, but that's a separate patch.)
>
> OK with those changes (including using "else" :-)), thanks.

Hi,
The patch regressed vdup_n_3.c and vzip_{2,3,4}.c because
aarch64_expand_vec_perm_const_1 was being passed uninitialized values
for d->op_mode and d->op_vec_flags when called from aarch64_evpc_reencode.
The attached patch fixes the issue by setting newd.op_mode to newd.vmode
and newd.op_vec_flags to newd.vec_flags.
Does that look OK?
Bootstrap+test is in progress on aarch64-linux-gnu.

PS: How do I bootstrap with SVE enabled? Would
make BOOT_CFLAGS="-mcpu=generic+sve" be sufficient?
So far I have only tested the patch with a normal bootstrap+test.
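
To illustrate the regression described above, here is a minimal standalone
sketch in plain C. It is a simplified model and not GCC code: the enum
values, struct perm_d, reencode and uses_same_mode_path are made-up
stand-ins for machine_mode, expand_vec_perm_d, aarch64_evpc_reencode and
the vmode == op_mode dispatch. The sketch zero-initializes the new
descriptor so that it stays well defined; in the real code the two op_*
fields were simply left unset.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for GCC's machine_mode and vector flags.  */
enum mode { MODE_VOID = 0, MODE_V4SI, MODE_V2DI, MODE_VNx4SI };
enum vflags { FLAGS_NONE = 0, FLAGS_ADVSIMD, FLAGS_SVE_DATA };

/* Simplified model of struct expand_vec_perm_d: only the mode fields.  */
struct perm_d
{
  enum mode vmode;
  enum mode op_mode;
  enum vflags vec_flags;
  enum vflags op_vec_flags;
};

/* Model of the re-encoding step: build a new descriptor for the same
   permutation viewed in NEW_MODE.  Operand and result modes are equal
   here, so the op_* fields mirror the result fields.  */
static struct perm_d
reencode (enum mode new_mode)
{
  struct perm_d newd = { MODE_VOID, MODE_VOID, FLAGS_NONE, FLAGS_NONE };
  newd.vmode = new_mode;
  newd.vec_flags = FLAGS_ADVSIMD;
  /* The fix: without these two assignments the op_* fields are never
     set before the descriptor is handed to the dispatcher.  */
  newd.op_mode = newd.vmode;
  newd.op_vec_flags = newd.vec_flags;
  return newd;
}

/* Model of the dispatch: the same-mode matchers only run when the
   operand and result modes agree.  */
static bool
uses_same_mode_path (const struct perm_d *d)
{
  assert (d->op_mode != MODE_VOID);
  return d->vmode == d->op_mode;
}
```

With the two assignments in place, a re-encoded descriptor takes the
same-mode path again, which is what the regressed tests rely on.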

Thanks,
Prathamesh
>
> Richard
>
> > +      || d->perm.encoding ().nelts_per_pattern () != 1
> > +      || !known_eq (d->perm.encoding ().npatterns (),
> > +                    GET_MODE_NUNITS (d->op_mode)))
> > +    return false;
> > +
> > +  int npatterns = d->perm.encoding ().npatterns ();
> > +  for (int i = 0; i < npatterns; i++)
> > +    if (!known_eq (d->perm[i], i))
> > +      return false;
> > +
> > +  if (d->testing_p)
> > +    return true;
> > +
> > +  aarch64_expand_sve_dupq (d->target, GET_MODE (d->target), d->op0);
> > +  return true;
> > +}
> > +
> >  /* Try to implement D using SVE SEL instruction.  */
> >
> >  static bool
> > @@ -24084,30 +24112,39 @@ aarch64_expand_vec_perm_const_1 (struct expand_vec_perm_d *d)
> >         || d->vec_flags == VEC_SVE_PRED)
> >        && known_gt (nelt, 1))
> >      {
> > -      if (aarch64_evpc_rev_local (d))
> > -        return true;
> > -      else if (aarch64_evpc_rev_global (d))
> > -        return true;
> > -      else if (aarch64_evpc_ext (d))
> > -        return true;
> > -      else if (aarch64_evpc_dup (d))
> > -        return true;
> > -      else if (aarch64_evpc_zip (d))
> > -        return true;
> > -      else if (aarch64_evpc_uzp (d))
> > -        return true;
> > -      else if (aarch64_evpc_trn (d))
> > -        return true;
> > -      else if (aarch64_evpc_sel (d))
> > -        return true;
> > -      else if (aarch64_evpc_ins (d))
> > -        return true;
> > -      else if (aarch64_evpc_reencode (d))
> > +      /* If operand and result modes differ, then only check
> > +         for dup case.  */
> > +      if (d->vmode == d->op_mode)
> > +        {
> > +          if (aarch64_evpc_rev_local (d))
> > +            return true;
> > +          else if (aarch64_evpc_rev_global (d))
> > +            return true;
> > +          else if (aarch64_evpc_ext (d))
> > +            return true;
> > +          else if (aarch64_evpc_dup (d))
> > +            return true;
> > +          else if (aarch64_evpc_zip (d))
> > +            return true;
> > +          else if (aarch64_evpc_uzp (d))
> > +            return true;
> > +          else if (aarch64_evpc_trn (d))
> > +            return true;
> > +          else if (aarch64_evpc_sel (d))
> > +            return true;
> > +          else if (aarch64_evpc_ins (d))
> > +            return true;
> > +          else if (aarch64_evpc_reencode (d))
> > +            return true;
> > +
> > +          if (d->vec_flags == VEC_SVE_DATA)
> > +            return aarch64_evpc_sve_tbl (d);
> > +          else if (d->vec_flags == VEC_ADVSIMD)
> > +            return aarch64_evpc_tbl (d);
> > +        }
> > +
> > +      if (aarch64_evpc_sve_dup (d))
> >          return true;
> > -      if (d->vec_flags == VEC_SVE_DATA)
> > -        return aarch64_evpc_sve_tbl (d);
> > -      else if (d->vec_flags == VEC_ADVSIMD)
> > -        return aarch64_evpc_tbl (d);
> >      }
> >    return false;
> >  }
> > @@ -24119,9 +24156,6 @@ aarch64_vectorize_vec_perm_const (machine_mode vmode, machine_mode op_mode,
> >                                    rtx target, rtx op0, rtx op1,
> >                                    const vec_perm_indices &sel)
> >  {
> > -  if (vmode != op_mode)
> > -    return false;
> > -
> >    struct expand_vec_perm_d d;
> >
> >    /* Check whether the mask can be applied to a single vector.  */
> > @@ -24145,6 +24179,8 @@ aarch64_vectorize_vec_perm_const (machine_mode vmode, machine_mode op_mode,
> >                       sel.nelts_per_input ());
> >    d.vmode = vmode;
> >    d.vec_flags = aarch64_classify_vector_mode (d.vmode);
> > +  d.op_mode = op_mode;
> > +  d.op_vec_flags = aarch64_classify_vector_mode (d.op_mode);
> >    d.target = target;
> >    d.op0 = op0 ? force_reg (vmode, op0) : NULL_RTX;
> >    if (op0 == op1)

--0000000000004eaa9705e0d9520e
Content-Type: text/plain; charset="US-ASCII"; name="pr96463-13.txt"
Content-Disposition: attachment; filename="pr96463-13.txt"
Content-Transfer-Encoding: base64
X-Attachment-Id: f_l441c2n40

ZGlmZiAtLWdpdCBhL2djYy9jb25maWcvYWFyY2g2NC9hYXJjaDY0LXN2ZS1idWlsdGlucy1iYXNl LmNjIGIvZ2NjL2NvbmZpZy9hYXJjaDY0L2FhcmNoNjQtc3ZlLWJ1aWx0aW5zLWJhc2UuY2MKaW5k ZXggYmVlNDEwOTI5YmQuLjFhODA0YjFhYjczIDEwMDY0NAotLS0gYS9nY2MvY29uZmlnL2FhcmNo NjQvYWFyY2g2NC1zdmUtYnVpbHRpbnMtYmFzZS5jYworKysgYi9nY2MvY29uZmlnL2FhcmNoNjQv YWFyY2g2NC1zdmUtYnVpbHRpbnMtYmFzZS5jYwpAQCAtNDQsNiArNDQsNyBAQAogI2luY2x1ZGUg ImFhcmNoNjQtc3ZlLWJ1aWx0aW5zLXNoYXBlcy5oIgogI2luY2x1ZGUgImFhcmNoNjQtc3ZlLWJ1 aWx0aW5zLWJhc2UuaCIKICNpbmNsdWRlICJhYXJjaDY0LXN2ZS1idWlsdGlucy1mdW5jdGlvbnMu aCIKKyNpbmNsdWRlICJzc2EuaCIKIAogdXNpbmcgbmFtZXNwYWNlIGFhcmNoNjRfc3ZlOwogCkBA IC0xMjA3LDYgKzEyMDgsNjQgQEAgcHVibGljOgogICAgIGluc25fY29kZSBpY29kZSA9IGNvZGVf Zm9yX2FhcmNoNjRfc3ZlX2xkMXJxIChlLnZlY3Rvcl9tb2RlICgwKSk7CiAgICAgcmV0dXJuIGUu dXNlX2NvbnRpZ3VvdXNfbG9hZF9pbnNuIChpY29kZSk7CiAgIH0KKworICBnaW1wbGUgKgorICBm b2xkIChnaW1wbGVfZm9sZGVyICZmKSBjb25zdCBvdmVycmlkZQorICB7CisgICAgdHJlZSBhcmcw ID0gZ2ltcGxlX2NhbGxfYXJnIChmLmNhbGwsIDApOworICAgIHRyZWUgYXJnMSA9IGdpbXBsZV9j YWxsX2FyZyAoZi5jYWxsLCAxKTsKKworICAgIC8qIFRyYW5zZm9ybToKKyAgICAgICBsaHMgPSBz dmxkMXJxICh7LTEsIC0xLCAuLi4gfSwgYXJnMSkKKyAgICAgICBpbnRvOgorICAgICAgIHRtcCA9 IG1lbV9yZWY8dmVjdHlwZT4gWyhpbnQgKiB7cmVmLWFsbH0pIGFyZzFdCisgICAgICAgbGhzID0g dmVjX3Blcm1fZXhwcjx0bXAsIHRtcCwgezAsIDEsIDIsIDMsIC4uLn0+LgorICAgICAgIG9uIGxp dHRsZSBlbmRpYW4gdGFyZ2V0LgorICAgICAgIHZlY3R5cGUgaXMgdGhlIGNvcnJlc3BvbmRpbmcg QURWU0lNRCB0eXBlLiAgKi8KKworICAgIGlmICghQllURVNfQklHX0VORElBTgorCSYmIGludGVn ZXJfYWxsX29uZXNwIChhcmcwKSkKKyAgICAgIHsKKwl0cmVlIGxocyA9IGdpbXBsZV9jYWxsX2xo cyAoZi5jYWxsKTsKKwl0cmVlIGxoc190eXBlID0gVFJFRV9UWVBFIChsaHMpOworCXBvbHlfdWlu dDY0IGxoc19sZW4gPSBUWVBFX1ZFQ1RPUl9TVUJQQVJUUyAobGhzX3R5cGUpOworCXRyZWUgZWx0
eXBlID0gVFJFRV9UWVBFIChsaHNfdHlwZSk7CisKKwlzY2FsYXJfbW9kZSBlbG1vZGUgPSBHRVRf TU9ERV9JTk5FUiAoVFlQRV9NT0RFIChsaHNfdHlwZSkpOworCW1hY2hpbmVfbW9kZSB2cV9tb2Rl ID0gYWFyY2g2NF92cV9tb2RlIChlbG1vZGUpLnJlcXVpcmUgKCk7CisJdHJlZSB2ZWN0eXBlID0g YnVpbGRfdmVjdG9yX3R5cGVfZm9yX21vZGUgKGVsdHlwZSwgdnFfbW9kZSk7CisKKwl0cmVlIGVs dF9wdHJfdHlwZQorCSAgPSBidWlsZF9wb2ludGVyX3R5cGVfZm9yX21vZGUgKGVsdHlwZSwgVk9J RG1vZGUsIHRydWUpOworCXRyZWUgemVybyA9IGJ1aWxkX3plcm9fY3N0IChlbHRfcHRyX3R5cGUp OworCisJLyogVXNlIGVsZW1lbnQgdHlwZSBhbGlnbm1lbnQuICAqLworCXRyZWUgYWNjZXNzX3R5 cGUKKwkgID0gYnVpbGRfYWxpZ25lZF90eXBlICh2ZWN0eXBlLCBUWVBFX0FMSUdOIChlbHR5cGUp KTsKKworCXRyZWUgbWVtX3JlZl9saHMgPSBtYWtlX3NzYV9uYW1lX2ZuIChjZnVuLCBhY2Nlc3Nf dHlwZSwgMCk7CisJdHJlZSBtZW1fcmVmX29wID0gZm9sZF9idWlsZDIgKE1FTV9SRUYsIGFjY2Vz c190eXBlLCBhcmcxLCB6ZXJvKTsKKwlnaW1wbGUgKm1lbV9yZWZfc3RtdAorCSAgPSBnaW1wbGVf YnVpbGRfYXNzaWduIChtZW1fcmVmX2xocywgbWVtX3JlZl9vcCk7CisJZ3NpX2luc2VydF9iZWZv cmUgKGYuZ3NpLCBtZW1fcmVmX3N0bXQsIEdTSV9TQU1FX1NUTVQpOworCisJaW50IHNvdXJjZV9u ZWx0cyA9IFRZUEVfVkVDVE9SX1NVQlBBUlRTIChhY2Nlc3NfdHlwZSkudG9fY29uc3RhbnQgKCk7 CisJdmVjX3Blcm1fYnVpbGRlciBzZWwgKGxoc19sZW4sIHNvdXJjZV9uZWx0cywgMSk7CisJZm9y IChpbnQgaSA9IDA7IGkgPCBzb3VyY2VfbmVsdHM7IGkrKykKKwkgIHNlbC5xdWlja19wdXNoIChp KTsKKworCXZlY19wZXJtX2luZGljZXMgaW5kaWNlcyAoc2VsLCAxLCBzb3VyY2VfbmVsdHMpOwor CWdjY19jaGVja2luZ19hc3NlcnQgKGNhbl92ZWNfcGVybV9jb25zdF9wIChUWVBFX01PREUgKGxo c190eXBlKSwKKwkJCQkJCSAgIFRZUEVfTU9ERSAoYWNjZXNzX3R5cGUpLAorCQkJCQkJICAgaW5k aWNlcykpOworCXRyZWUgbWFza190eXBlID0gYnVpbGRfdmVjdG9yX3R5cGUgKHNzaXpldHlwZSwg bGhzX2xlbik7CisJdHJlZSBtYXNrID0gdmVjX3Blcm1faW5kaWNlc190b190cmVlIChtYXNrX3R5 cGUsIGluZGljZXMpOworCXJldHVybiBnaW1wbGVfYnVpbGRfYXNzaWduIChsaHMsIFZFQ19QRVJN X0VYUFIsCisJCQkJICAgIG1lbV9yZWZfbGhzLCBtZW1fcmVmX2xocywgbWFzayk7CisgICAgICB9 CisKKyAgICByZXR1cm4gTlVMTDsKKyAgfQogfTsKIAogY2xhc3Mgc3ZsZDFyb19pbXBsIDogcHVi bGljIGxvYWRfcmVwbGljYXRlCmRpZmYgLS1naXQgYS9nY2MvY29uZmlnL2FhcmNoNjQvYWFyY2g2 
NC5jYyBiL2djYy9jb25maWcvYWFyY2g2NC9hYXJjaDY0LmNjCmluZGV4IGQ0YzU3NWNlOTc2Li4z NzExNzQ1NjlmMCAxMDA2NDQKLS0tIGEvZ2NjL2NvbmZpZy9hYXJjaDY0L2FhcmNoNjQuY2MKKysr IGIvZ2NjL2NvbmZpZy9hYXJjaDY0L2FhcmNoNjQuY2MKQEAgLTIzMzk2LDcgKzIzMzk2LDkgQEAg c3RydWN0IGV4cGFuZF92ZWNfcGVybV9kCiAgIHJ0eCB0YXJnZXQsIG9wMCwgb3AxOwogICB2ZWNf cGVybV9pbmRpY2VzIHBlcm07CiAgIG1hY2hpbmVfbW9kZSB2bW9kZTsKKyAgbWFjaGluZV9tb2Rl IG9wX21vZGU7CiAgIHVuc2lnbmVkIGludCB2ZWNfZmxhZ3M7CisgIHVuc2lnbmVkIGludCBvcF92 ZWNfZmxhZ3M7CiAgIGJvb2wgb25lX3ZlY3Rvcl9wOwogICBib29sIHRlc3RpbmdfcDsKIH07CkBA IC0yMzYzMSw2ICsyMzYzMyw4IEBAIGFhcmNoNjRfZXZwY19yZWVuY29kZSAoc3RydWN0IGV4cGFu ZF92ZWNfcGVybV9kICpkKQogCiAgIG5ld2Qudm1vZGUgPSBuZXdfbW9kZTsKICAgbmV3ZC52ZWNf ZmxhZ3MgPSBWRUNfQURWU0lNRDsKKyAgbmV3ZC5vcF9tb2RlID0gbmV3ZC52bW9kZTsKKyAgbmV3 ZC5vcF92ZWNfZmxhZ3MgPSBuZXdkLnZlY19mbGFnczsKICAgbmV3ZC50YXJnZXQgPSBkLT50YXJn ZXQgPyBnZW5fbG93cGFydCAobmV3X21vZGUsIGQtPnRhcmdldCkgOiBOVUxMOwogICBuZXdkLm9w MCA9IGQtPm9wMCA/IGdlbl9sb3dwYXJ0IChuZXdfbW9kZSwgZC0+b3AwKSA6IE5VTEw7CiAgIG5l d2Qub3AxID0gZC0+b3AxID8gZ2VuX2xvd3BhcnQgKG5ld19tb2RlLCBkLT5vcDEpIDogTlVMTDsK QEAgLTIzOTQ1LDYgKzIzOTQ5LDMzIEBAIGFhcmNoNjRfZXZwY19zdmVfdGJsIChzdHJ1Y3QgZXhw YW5kX3ZlY19wZXJtX2QgKmQpCiAgIHJldHVybiB0cnVlOwogfQogCisvKiBUcnkgdG8gaW1wbGVt ZW50IEQgdXNpbmcgU1ZFIGR1cCBpbnN0cnVjdGlvbi4gICovCisKK3N0YXRpYyBib29sCithYXJj aDY0X2V2cGNfc3ZlX2R1cCAoc3RydWN0IGV4cGFuZF92ZWNfcGVybV9kICpkKQoreworICBpZiAo QllURVNfQklHX0VORElBTgorICAgICAgfHwgIWQtPm9uZV92ZWN0b3JfcAorICAgICAgfHwgZC0+ dmVjX2ZsYWdzICE9IFZFQ19TVkVfREFUQQorICAgICAgfHwgZC0+b3BfdmVjX2ZsYWdzICE9IFZF Q19BRFZTSU1ECisgICAgICB8fCBkLT5wZXJtLmVuY29kaW5nICgpLm5lbHRzX3Blcl9wYXR0ZXJu ICgpICE9IDEKKyAgICAgIHx8ICFrbm93bl9lcSAoZC0+cGVybS5lbmNvZGluZyAoKS5ucGF0dGVy bnMgKCksCisJCSAgICBHRVRfTU9ERV9OVU5JVFMgKGQtPm9wX21vZGUpKQorICAgICAgfHwgIWtu b3duX2VxIChHRVRfTU9ERV9CSVRTSVpFIChkLT5vcF9tb2RlKSwgMTI4KSkKKyAgICByZXR1cm4g ZmFsc2U7CisKKyAgaW50IG5wYXR0ZXJucyA9IGQtPnBlcm0uZW5jb2RpbmcgKCkubnBhdHRlcm5z 
ICgpOworICBmb3IgKGludCBpID0gMDsgaSA8IG5wYXR0ZXJuczsgaSsrKQorICAgIGlmICgha25v d25fZXEgKGQtPnBlcm1baV0sIGkpKQorICAgICAgcmV0dXJuIGZhbHNlOworCisgIGlmIChkLT50 ZXN0aW5nX3ApCisgICAgcmV0dXJuIHRydWU7CisKKyAgYWFyY2g2NF9leHBhbmRfc3ZlX2R1cHEg KGQtPnRhcmdldCwgR0VUX01PREUgKGQtPnRhcmdldCksIGQtPm9wMCk7CisgIHJldHVybiB0cnVl OworfQorCiAvKiBUcnkgdG8gaW1wbGVtZW50IEQgdXNpbmcgU1ZFIFNFTCBpbnN0cnVjdGlvbi4g ICovCiAKIHN0YXRpYyBib29sCkBAIC0yNDA2OCw2ICsyNDA5OSw4IEBAIGFhcmNoNjRfZXZwY19p bnMgKHN0cnVjdCBleHBhbmRfdmVjX3Blcm1fZCAqZCkKIHN0YXRpYyBib29sCiBhYXJjaDY0X2V4 cGFuZF92ZWNfcGVybV9jb25zdF8xIChzdHJ1Y3QgZXhwYW5kX3ZlY19wZXJtX2QgKmQpCiB7Cisg IGdjY19hc3NlcnQgKGQtPm9wX21vZGUgIT0gRV9WT0lEbW9kZSk7CisKICAgLyogVGhlIHBhdHRl cm4gbWF0Y2hpbmcgZnVuY3Rpb25zIGFib3ZlIGFyZSB3cml0dGVuIHRvIGxvb2sgZm9yIGEgc21h bGwKICAgICAgbnVtYmVyIHRvIGJlZ2luIHRoZSBzZXF1ZW5jZSAoMCwgMSwgTi8yKS4gIElmIHdl IGJlZ2luIHdpdGggYW4gaW5kZXgKICAgICAgZnJvbSB0aGUgc2Vjb25kIG9wZXJhbmQsIHdlIGNh biBzd2FwIHRoZSBvcGVyYW5kcy4gICovCkBAIC0yNDA4NCwzMCArMjQxMTcsMzkgQEAgYWFyY2g2 NF9leHBhbmRfdmVjX3Blcm1fY29uc3RfMSAoc3RydWN0IGV4cGFuZF92ZWNfcGVybV9kICpkKQog ICAgICAgIHx8IGQtPnZlY19mbGFncyA9PSBWRUNfU1ZFX1BSRUQpCiAgICAgICAmJiBrbm93bl9n dCAobmVsdCwgMSkpCiAgICAgewotICAgICAgaWYgKGFhcmNoNjRfZXZwY19yZXZfbG9jYWwgKGQp KQotCXJldHVybiB0cnVlOwotICAgICAgZWxzZSBpZiAoYWFyY2g2NF9ldnBjX3Jldl9nbG9iYWwg KGQpKQotCXJldHVybiB0cnVlOwotICAgICAgZWxzZSBpZiAoYWFyY2g2NF9ldnBjX2V4dCAoZCkp Ci0JcmV0dXJuIHRydWU7Ci0gICAgICBlbHNlIGlmIChhYXJjaDY0X2V2cGNfZHVwIChkKSkKLQly ZXR1cm4gdHJ1ZTsKLSAgICAgIGVsc2UgaWYgKGFhcmNoNjRfZXZwY196aXAgKGQpKQotCXJldHVy biB0cnVlOwotICAgICAgZWxzZSBpZiAoYWFyY2g2NF9ldnBjX3V6cCAoZCkpCi0JcmV0dXJuIHRy dWU7Ci0gICAgICBlbHNlIGlmIChhYXJjaDY0X2V2cGNfdHJuIChkKSkKLQlyZXR1cm4gdHJ1ZTsK LSAgICAgIGVsc2UgaWYgKGFhcmNoNjRfZXZwY19zZWwgKGQpKQotCXJldHVybiB0cnVlOwotICAg ICAgZWxzZSBpZiAoYWFyY2g2NF9ldnBjX2lucyAoZCkpCi0JcmV0dXJuIHRydWU7Ci0gICAgICBl bHNlIGlmIChhYXJjaDY0X2V2cGNfcmVlbmNvZGUgKGQpKQotCXJldHVybiB0cnVlOwotICAgICAg 
aWYgKGQtPnZlY19mbGFncyA9PSBWRUNfU1ZFX0RBVEEpCi0JcmV0dXJuIGFhcmNoNjRfZXZwY19z dmVfdGJsIChkKTsKLSAgICAgIGVsc2UgaWYgKGQtPnZlY19mbGFncyA9PSBWRUNfQURWU0lNRCkK LQlyZXR1cm4gYWFyY2g2NF9ldnBjX3RibCAoZCk7CisgICAgICBpZiAoZC0+dm1vZGUgPT0gZC0+ b3BfbW9kZSkKKwl7CisJICBpZiAoYWFyY2g2NF9ldnBjX3Jldl9sb2NhbCAoZCkpCisJICAgIHJl dHVybiB0cnVlOworCSAgZWxzZSBpZiAoYWFyY2g2NF9ldnBjX3Jldl9nbG9iYWwgKGQpKQorCSAg ICByZXR1cm4gdHJ1ZTsKKwkgIGVsc2UgaWYgKGFhcmNoNjRfZXZwY19leHQgKGQpKQorCSAgICBy ZXR1cm4gdHJ1ZTsKKwkgIGVsc2UgaWYgKGFhcmNoNjRfZXZwY19kdXAgKGQpKQorCSAgICByZXR1 cm4gdHJ1ZTsKKwkgIGVsc2UgaWYgKGFhcmNoNjRfZXZwY196aXAgKGQpKQorCSAgICByZXR1cm4g dHJ1ZTsKKwkgIGVsc2UgaWYgKGFhcmNoNjRfZXZwY191enAgKGQpKQorCSAgICByZXR1cm4gdHJ1 ZTsKKwkgIGVsc2UgaWYgKGFhcmNoNjRfZXZwY190cm4gKGQpKQorCSAgICByZXR1cm4gdHJ1ZTsK KwkgIGVsc2UgaWYgKGFhcmNoNjRfZXZwY19zZWwgKGQpKQorCSAgICByZXR1cm4gdHJ1ZTsKKwkg IGVsc2UgaWYgKGFhcmNoNjRfZXZwY19pbnMgKGQpKQorCSAgICByZXR1cm4gdHJ1ZTsKKwkgIGVs c2UgaWYgKGFhcmNoNjRfZXZwY19yZWVuY29kZSAoZCkpCisJICAgIHJldHVybiB0cnVlOworCisJ ICBpZiAoZC0+dmVjX2ZsYWdzID09IFZFQ19TVkVfREFUQSkKKwkgICAgcmV0dXJuIGFhcmNoNjRf ZXZwY19zdmVfdGJsIChkKTsKKwkgIGVsc2UgaWYgKGQtPnZlY19mbGFncyA9PSBWRUNfQURWU0lN RCkKKwkgICAgcmV0dXJuIGFhcmNoNjRfZXZwY190YmwgKGQpOworCX0KKyAgICAgIGVsc2UKKwl7 CisJICBpZiAoYWFyY2g2NF9ldnBjX3N2ZV9kdXAgKGQpKQorCSAgICByZXR1cm4gdHJ1ZTsKKwl9 CiAgICAgfQogICByZXR1cm4gZmFsc2U7CiB9CkBAIC0yNDExOSw5ICsyNDE2MSw2IEBAIGFhcmNo NjRfdmVjdG9yaXplX3ZlY19wZXJtX2NvbnN0IChtYWNoaW5lX21vZGUgdm1vZGUsIG1hY2hpbmVf bW9kZSBvcF9tb2RlLAogCQkJCSAgcnR4IHRhcmdldCwgcnR4IG9wMCwgcnR4IG9wMSwKIAkJCQkg IGNvbnN0IHZlY19wZXJtX2luZGljZXMgJnNlbCkKIHsKLSAgaWYgKHZtb2RlICE9IG9wX21vZGUp Ci0gICAgcmV0dXJuIGZhbHNlOwotCiAgIHN0cnVjdCBleHBhbmRfdmVjX3Blcm1fZCBkOwogCiAg IC8qIENoZWNrIHdoZXRoZXIgdGhlIG1hc2sgY2FuIGJlIGFwcGxpZWQgdG8gYSBzaW5nbGUgdmVj dG9yLiAgKi8KQEAgLTI0MTQ1LDYgKzI0MTg0LDggQEAgYWFyY2g2NF92ZWN0b3JpemVfdmVjX3Bl cm1fY29uc3QgKG1hY2hpbmVfbW9kZSB2bW9kZSwgbWFjaGluZV9tb2RlIG9wX21vZGUsCiAJCSAg 
ICAgc2VsLm5lbHRzX3Blcl9pbnB1dCAoKSk7CiAgIGQudm1vZGUgPSB2bW9kZTsKICAgZC52ZWNf ZmxhZ3MgPSBhYXJjaDY0X2NsYXNzaWZ5X3ZlY3Rvcl9tb2RlIChkLnZtb2RlKTsKKyAgZC5vcF9t b2RlID0gb3BfbW9kZTsKKyAgZC5vcF92ZWNfZmxhZ3MgPSBhYXJjaDY0X2NsYXNzaWZ5X3ZlY3Rv cl9tb2RlIChkLm9wX21vZGUpOwogICBkLnRhcmdldCA9IHRhcmdldDsKICAgZC5vcDAgPSBvcDAg PyBmb3JjZV9yZWcgKHZtb2RlLCBvcDApIDogTlVMTF9SVFg7CiAgIGlmIChvcDAgPT0gb3AxKQo= --0000000000004eaa9705e0d9520e--