From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mail-ed1-x52b.google.com (mail-ed1-x52b.google.com [IPv6:2a00:1450:4864:20::52b]) by sourceware.org (Postfix) with ESMTPS id 57BCF3858D3C for ; Mon, 27 Dec 2021 10:25:24 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.1 sourceware.org 57BCF3858D3C Received: by mail-ed1-x52b.google.com with SMTP id q14so52297700edi.3 for ; Mon, 27 Dec 2021 02:25:24 -0800 (PST) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=x-gm-message-state:mime-version:references:in-reply-to:from:date :message-id:subject:to; bh=rM7w66Lsk9jHjhQoTHVCTpYKW3i9UxaxthnjEJvyLbE=; b=sAvr22wAh9QI7i1OADohVwLY7ytRe+vQXfwu7E8vnI+VuzhbZd3IdkQFkvWJqx0ATj POPPVlWgzz0ztFwDHbguGoY9H3MH1InmfYTJ/jPFYTNLVS40qUmIo0KaEnDSflzcK8Bu zsWRocJn5jdTDKBNI2hC5NRBgGWpDp0iyeSte3ae4xuPFG1L5K0VD8E+JQx3wl0zT4CP paOLFPBgsktVaauYxaBYeOd2TGyNPah2oAtKyAMxDAshrmDIqdqeeWcYAkYkTGvbbVj1 BPTjxQiTJIjpDhiDXJSjX+joeZeuahS2vLs86urNzFWhDdgW1Quxki9h2oyR6ZhCuOek 3L/Q== X-Gm-Message-State: AOAM532PCZPd8QsMzKrgkV/zs+FUlDIu1El+AIyk6h3rMae7n6t8QB8I LYA7mH13rXEEw9GsMAv3Q0gJaqMmpMsPczsLQW0FDQ== X-Google-Smtp-Source: ABdhPJzyBiT9u+E722XN23ZYkxWQ45nyX/nAgX6dW/Ea+RrpIIVY68zwU+C5D4Q3yRwmvK60RDQCtAUXOlR/djgkBJ4= X-Received: by 2002:a17:907:c10:: with SMTP id ga16mr13462209ejc.502.1640600723113; Mon, 27 Dec 2021 02:25:23 -0800 (PST) MIME-Version: 1.0 References: In-Reply-To: From: Prathamesh Kulkarni Date: Mon, 27 Dec 2021 15:54:49 +0530 Message-ID: Subject: Re: [1/2] PR96463 - aarch64 specific changes To: Prathamesh Kulkarni , gcc Patches , richard.sandiford@arm.com Content-Type: multipart/mixed; boundary="00000000000052213105d41e1f05" X-Spam-Status: No, score=-9.0 required=5.0 tests=BAYES_00, DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, DKIM_VALID_EF, GIT_PATCH_0, KAM_SHORT, RCVD_IN_DNSWL_NONE, SPF_HELO_NONE, SPF_PASS, TXREP autolearn=ham autolearn_force=no version=3.4.4 X-Spam-Checker-Version: SpamAssassin 3.4.4 (2020-01-24) on server2.sourceware.org X-BeenThere: gcc-patches@gcc.gnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Gcc-patches mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 27 Dec 2021 10:25:27 -0000 --00000000000052213105d41e1f05 Content-Type: text/plain; charset="UTF-8" Content-Transfer-Encoding: quoted-printable On Fri, 17 Dec 2021 at 17:03, Richard Sandiford wrote: > > Prathamesh Kulkarni writes: > > Hi, > > The patch folds: > > lhs =3D svld1rq ({-1, -1, -1, ...}, &v[0]) > > into: > > lhs =3D vec_perm_expr > > and expands above vec_perm_expr using aarch64_expand_sve_dupq. > > > > With patch, for following test: > > #include > > #include > > > > svint32_t > > foo (int32x4_t x) > > { > > return svld1rq (svptrue_b8 (), &x[0]); > > } > > > > it generates following code: > > foo: > > .LFB4350: > > dup z0.q, z0.q[0] > > ret > > > > and passes bootstrap+test on aarch64-linux-gnu. > > But I am not sure if the changes to aarch64_evpc_sve_tbl > > are correct. > > Just in case: I was only using int32x4_t in the PR as an example. > The same thing should work for all element types. > > > > > Thanks, > > Prathamesh > > > > diff --git a/gcc/config/aarch64/aarch64-sve-builtins-base.cc b/gcc/conf= ig/aarch64/aarch64-sve-builtins-base.cc > > index 02e42a71e5e..e21bbec360c 100644 > > --- a/gcc/config/aarch64/aarch64-sve-builtins-base.cc > > +++ b/gcc/config/aarch64/aarch64-sve-builtins-base.cc > > @@ -1207,6 +1207,56 @@ public: > > insn_code icode =3D code_for_aarch64_sve_ld1rq (e.vector_mode (0))= ; > > return e.use_contiguous_load_insn (icode); > > } > > + > > + gimple * > > + fold (gimple_folder &f) const OVERRIDE > > + { > > + tree arg0 =3D gimple_call_arg (f.call, 0); > > + tree arg1 =3D gimple_call_arg (f.call, 1); > > + > > + /* Transform: > > + lhs =3D svld1rq ({-1, -1, ... }, &v[0]) > > + into: > > + lhs =3D vec_perm_expr. > > + on little endian target. */ > > + > > + if (!BYTES_BIG_ENDIAN > > + && integer_all_onesp (arg0) > > + && TREE_CODE (arg1) =3D=3D ADDR_EXPR) > > + { > > + tree t =3D TREE_OPERAND (arg1, 0); > > + if (TREE_CODE (t) =3D=3D ARRAY_REF) > > + { > > + tree index =3D TREE_OPERAND (t, 1); > > + t =3D TREE_OPERAND (t, 0); > > + if (integer_zerop (index) && TREE_CODE (t) =3D=3D VIEW_CONVER= T_EXPR) > > + { > > + t =3D TREE_OPERAND (t, 0); > > + tree vectype =3D TREE_TYPE (t); > > + if (VECTOR_TYPE_P (vectype) > > + && known_eq (TYPE_VECTOR_SUBPARTS (vectype), 4u) > > + && wi::to_wide (TYPE_SIZE (vectype)) =3D=3D 128) > > + { > > Since this is quite a specific pattern match, and since we now lower > arm_neon.h vld1* to normal gimple accesses, I think we should try the > =E2=80=9Cmore generally=E2=80=9D approach mentioned in the PR and see wha= t the fallout > is. That is, keep: > > if (!BYTES_BIG_ENDIAN > && integer_all_onesp (arg0) > > If those conditions pass, create an Advanced SIMD access at address arg1, > using similar code to the handling of: > > BUILTIN_VALL_F16 (LOAD1, ld1, 0, LOAD) > BUILTIN_VDQ_I (LOAD1_U, ld1, 0, LOAD) > BUILTIN_VALLP_NO_DI (LOAD1_P, ld1, 0, LOAD) > > in aarch64_general_gimple_fold_builtin. (Would be good to move the > common code to aarch64.c so that both files can use it.) > > > + tree lhs =3D gimple_call_lhs (f.call); > > + tree lhs_type =3D TREE_TYPE (lhs); > > + int source_nelts =3D TYPE_VECTOR_SUBPARTS (vectype).t= o_constant (); > > + vec_perm_builder sel (TYPE_VECTOR_SUBPARTS (lhs_type)= , source_nelts, 1); > > + for (int i =3D 0; i < source_nelts; i++) > > + sel.quick_push (i); > > + > > + vec_perm_indices indices (sel, 1, source_nelts); > > + if (!can_vec_perm_const_p (TYPE_MODE (lhs_type), indi= ces)) > > + return NULL; > > I don't think we need to check this: it should always be true. > Probably worth keeping as a gcc_checking_assert though. > > > + > > + tree mask =3D vec_perm_indices_to_tree (lhs_type, ind= ices); > > + return gimple_build_assign (lhs, VEC_PERM_EXPR, t, t,= mask); > > + } > > + } > > + } > > + } > > + > > + return NULL; > > + } > > }; > > > > class svld1ro_impl : public load_replicate > > diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.= c > > index f07330cff4f..af27f550be3 100644 > > --- a/gcc/config/aarch64/aarch64.c > > +++ b/gcc/config/aarch64/aarch64.c > > @@ -23002,8 +23002,32 @@ aarch64_evpc_sve_tbl (struct expand_vec_perm_d= *d) > > > > machine_mode sel_mode =3D related_int_vector_mode (d->vmode).require= (); > > rtx sel =3D vec_perm_indices_to_rtx (sel_mode, d->perm); > > + > > if (d->one_vector_p) > > - emit_unspec2 (d->target, UNSPEC_TBL, d->op0, force_reg (sel_mode, = sel)); > > + { > > + bool use_dupq =3D false; > > + /* Check if sel is dup vector with encoded elements {0, 1, 2, ..= . nelts} */ > > + if (GET_CODE (sel) =3D=3D CONST_VECTOR > > + && !GET_MODE_NUNITS (GET_MODE (sel)).is_constant () > > + && CONST_VECTOR_DUPLICATE_P (sel)) > > + { > > + unsigned nelts =3D const_vector_encoded_nelts (sel); > > + unsigned i; > > + for (i =3D 0; i < nelts; i++) > > + { > > + rtx elem =3D CONST_VECTOR_ENCODED_ELT(sel, i); > > + if (!(CONST_INT_P (elem) && INTVAL(elem) =3D=3D i)) > > + break; > > + } > > + if (i =3D=3D nelts) > > + use_dupq =3D true; > > + } > > + > > + if (use_dupq) > > + aarch64_expand_sve_dupq (d->target, GET_MODE (d->target), d->op0)= ; > > + else > > + emit_unspec2 (d->target, UNSPEC_TBL, d->op0, force_reg (sel_mode,= sel)); > > + } > > This shouldn't be a TBL but a new operation, handled by its own > aarch64_evpc_sve_* routine. The check for the mask should then > be done on d->perm, to detect whether the permutation is one > that the new routine supports. > > I think the requirements are: > > - !BYTES_BIG_ENDIAN > - the source must be an Advanced SIMD vector > - the destination must be an SVE vector > - the permutation must be a duplicate (tested in the code above) > - the number of =E2=80=9Cpatterns=E2=80=9D in the permutation must equal = the number of > source elements > - element X of the permutation must equal X (tested in the code above) > > The existing aarch64_evpc_* routines expect the source and target modes > to be the same, so we should only call them when that's true. Hi Richard, Thanks for the suggestions, and sorry for late reply. Does the following patch look OK (sans the refactoring of building mem_ref)= ? Passes bootstrap+test on aarch64-linux-gnu. Thanks, Prathamesh > > Thanks, > Richard --00000000000052213105d41e1f05 Content-Type: text/plain; charset="US-ASCII"; name="pr96463-4-aarch64.txt" Content-Disposition: attachment; filename="pr96463-4-aarch64.txt" Content-Transfer-Encoding: base64 Content-ID: X-Attachment-Id: f_kxoj3e0n0 ZGlmZiAtLWdpdCBhL2djYy9jb25maWcvYWFyY2g2NC9hYXJjaDY0LWJ1aWx0aW5zLmMgYi9nY2Mv Y29uZmlnL2FhcmNoNjQvYWFyY2g2NC1idWlsdGlucy5jCmluZGV4IDBkMDlmZTlkZDZkLi42NTZk MzlhNzQxYyAxMDA2NDQKLS0tIGEvZ2NjL2NvbmZpZy9hYXJjaDY0L2FhcmNoNjQtYnVpbHRpbnMu YworKysgYi9nY2MvY29uZmlnL2FhcmNoNjQvYWFyY2g2NC1idWlsdGlucy5jCkBAIC00Nyw2ICs0 Nyw3IEBACiAjaW5jbHVkZSAic3RyaW5ncG9vbC5oIgogI2luY2x1ZGUgImF0dHJpYnMuaCIKICNp bmNsdWRlICJnaW1wbGUtZm9sZC5oIgorI2luY2x1ZGUgImFhcmNoNjQtYnVpbHRpbnMuaCIKIAog I2RlZmluZSB2OHFpX1VQICBFX1Y4UUltb2RlCiAjZGVmaW5lIHY4ZGlfVVAgIEVfVjhESW1vZGUK QEAgLTEyOCw0NiArMTI5LDYgQEAKIAogI2RlZmluZSBTSU1EX01BWF9CVUlMVElOX0FSR1MgNQog Ci1lbnVtIGFhcmNoNjRfdHlwZV9xdWFsaWZpZXJzCi17Ci0gIC8qIFQgZm9vLiAgKi8KLSAgcXVh bGlmaWVyX25vbmUgPSAweDAsCi0gIC8qIHVuc2lnbmVkIFQgZm9vLiAgKi8KLSAgcXVhbGlmaWVy X3Vuc2lnbmVkID0gMHgxLCAvKiAxIDw8IDAgICovCi0gIC8qIGNvbnN0IFQgZm9vLiAgKi8KLSAg cXVhbGlmaWVyX2NvbnN0ID0gMHgyLCAvKiAxIDw8IDEgICovCi0gIC8qIFQgKmZvby4gICovCi0g IHF1YWxpZmllcl9wb2ludGVyID0gMHg0LCAvKiAxIDw8IDIgICovCi0gIC8qIFVzZWQgd2hlbiBl eHBhbmRpbmcgYXJndW1lbnRzIGlmIGFuIG9wZXJhbmQgY291bGQKLSAgICAgYmUgYW4gaW1tZWRp YXRlLiAgKi8KLSAgcXVhbGlmaWVyX2ltbWVkaWF0ZSA9IDB4OCwgLyogMSA8PCAzICAqLwotICBx dWFsaWZpZXJfbWF5YmVfaW1tZWRpYXRlID0gMHgxMCwgLyogMSA8PCA0ICAqLwotICAvKiB2b2lk IGZvbyAoLi4uKS4gICovCi0gIHF1YWxpZmllcl92b2lkID0gMHgyMCwgLyogMSA8PCA1ICAqLwot ICAvKiBTb21lIHBhdHRlcm5zIG1heSBoYXZlIGludGVybmFsIG9wZXJhbmRzLCB0aGlzIHF1YWxp ZmllciBpcyBhbgotICAgICBpbnN0cnVjdGlvbiB0byB0aGUgaW5pdGlhbGlzYXRpb24gY29kZSB0 byBza2lwIHRoaXMgb3BlcmFuZC4gICovCi0gIHF1YWxpZmllcl9pbnRlcm5hbCA9IDB4NDAsIC8q IDEgPDwgNiAgKi8KLSAgLyogU29tZSBidWlsdGlucyBzaG91bGQgdXNlIHRoZSBUXyptb2RlKiBl bmNvZGVkIGluIGEgc2ltZF9idWlsdGluX2RhdHVtCi0gICAgIHJhdGhlciB0aGFuIHVzaW5nIHRo ZSB0eXBlIG9mIHRoZSBvcGVyYW5kLiAgKi8KLSAgcXVhbGlmaWVyX21hcF9tb2RlID0gMHg4MCwg LyogMSA8PCA3ICAqLwotICAvKiBxdWFsaWZpZXJfcG9pbnRlciB8IHF1YWxpZmllcl9tYXBfbW9k ZSAgKi8KLSAgcXVhbGlmaWVyX3BvaW50ZXJfbWFwX21vZGUgPSAweDg0LAotICAvKiBxdWFsaWZp ZXJfY29uc3QgfCBxdWFsaWZpZXJfcG9pbnRlciB8IHF1YWxpZmllcl9tYXBfbW9kZSAgKi8KLSAg cXVhbGlmaWVyX2NvbnN0X3BvaW50ZXJfbWFwX21vZGUgPSAweDg2LAotICAvKiBQb2x5bm9taWFs IHR5cGVzLiAgKi8KLSAgcXVhbGlmaWVyX3BvbHkgPSAweDEwMCwKLSAgLyogTGFuZSBpbmRpY2Vz IC0gbXVzdCBiZSBpbiByYW5nZSwgYW5kIGZsaXBwZWQgZm9yIGJpZ2VuZGlhbi4gICovCi0gIHF1 YWxpZmllcl9sYW5lX2luZGV4ID0gMHgyMDAsCi0gIC8qIExhbmUgaW5kaWNlcyBmb3Igc2luZ2xl IGxhbmUgc3RydWN0dXJlIGxvYWRzIGFuZCBzdG9yZXMuICAqLwotICBxdWFsaWZpZXJfc3RydWN0 X2xvYWRfc3RvcmVfbGFuZV9pbmRleCA9IDB4NDAwLAotICAvKiBMYW5lIGluZGljZXMgc2VsZWN0 ZWQgaW4gcGFpcnMuIC0gbXVzdCBiZSBpbiByYW5nZSwgYW5kIGZsaXBwZWQgZm9yCi0gICAgIGJp Z2VuZGlhbi4gICovCi0gIHF1YWxpZmllcl9sYW5lX3BhaXJfaW5kZXggPSAweDgwMCwKLSAgLyog TGFuZSBpbmRpY2VzIHNlbGVjdGVkIGluIHF1YWR0dXBsZXRzLiAtIG11c3QgYmUgaW4gcmFuZ2Us IGFuZCBmbGlwcGVkIGZvcgotICAgICBiaWdlbmRpYW4uICAqLwotICBxdWFsaWZpZXJfbGFuZV9x dWFkdHVwX2luZGV4ID0gMHgxMDAwLAotfTsKLQogLyogRmxhZ3MgdGhhdCBkZXNjcmliZSB3aGF0 IGEgZnVuY3Rpb24gbWlnaHQgZG8uICAqLwogY29uc3QgdW5zaWduZWQgaW50IEZMQUdfTk9ORSA9 IDBVOwogY29uc3QgdW5zaWduZWQgaW50IEZMQUdfUkVBRF9GUENSID0gMVUgPDwgMDsKQEAgLTY3 MSw0NCArNjMyLDYgQEAgY29uc3QgY2hhciAqYWFyY2g2NF9zY2FsYXJfYnVpbHRpbl90eXBlc1td ID0gewogICBOVUxMCiB9OwogCi0jZGVmaW5lIEVOVFJZKEUsIE0sIFEsIEcpIEUsCi1lbnVtIGFh cmNoNjRfc2ltZF90eXBlCi17Ci0jaW5jbHVkZSAiYWFyY2g2NC1zaW1kLWJ1aWx0aW4tdHlwZXMu ZGVmIgotICBBUk1fTkVPTl9IX1RZUEVTX0xBU1QKLX07Ci0jdW5kZWYgRU5UUlkKLQotc3RydWN0 IEdUWSgoKSkgYWFyY2g2NF9zaW1kX3R5cGVfaW5mbwotewotICBlbnVtIGFhcmNoNjRfc2ltZF90 eXBlIHR5cGU7Ci0KLSAgLyogSW50ZXJuYWwgdHlwZSBuYW1lLiAgKi8KLSAgY29uc3QgY2hhciAq bmFtZTsKLQotICAvKiBJbnRlcm5hbCB0eXBlIG5hbWUobWFuZ2xlZCkuICBUaGUgbWFuZ2xlZCBu YW1lcyBjb25mb3JtIHRvIHRoZQotICAgICBBQVBDUzY0IChzZWUgIlByb2NlZHVyZSBDYWxsIFN0 YW5kYXJkIGZvciB0aGUgQVJNIDY0LWJpdCBBcmNoaXRlY3R1cmUiLAotICAgICBBcHBlbmRpeCBB KS4gIFRvIHF1YWxpZnkgZm9yIGVtaXNzaW9uIHdpdGggdGhlIG1hbmdsZWQgbmFtZXMgZGVmaW5l ZCBpbgotICAgICB0aGF0IGRvY3VtZW50LCBhIHZlY3RvciB0eXBlIG11c3Qgbm90IG9ubHkgYmUg b2YgdGhlIGNvcnJlY3QgbW9kZSBidXQgYWxzbwotICAgICBiZSBvZiB0aGUgY29ycmVjdCBpbnRl cm5hbCBBZHZTSU1EIHZlY3RvciB0eXBlIChlLmcuIF9fSW50OHg4X3QpOyB0aGVzZQotICAgICB0 eXBlcyBhcmUgcmVnaXN0ZXJlZCBieSBhYXJjaDY0X2luaXRfc2ltZF9idWlsdGluX3R5cGVzICgp LiAgSW4gb3RoZXIKLSAgICAgd29yZHMsIHZlY3RvciB0eXBlcyBkZWZpbmVkIGluIG90aGVyIHdh eXMgZS5nLiB2aWEgdmVjdG9yX3NpemUgYXR0cmlidXRlCi0gICAgIHdpbGwgZ2V0IGRlZmF1bHQg bWFuZ2xlZCBuYW1lcy4gICovCi0gIGNvbnN0IGNoYXIgKm1hbmdsZTsKLQotICAvKiBJbnRlcm5h bCB0eXBlLiAgKi8KLSAgdHJlZSBpdHlwZTsKLQotICAvKiBFbGVtZW50IHR5cGUuICAqLwotICB0 cmVlIGVsdHlwZTsKLQotICAvKiBNYWNoaW5lIG1vZGUgdGhlIGludGVybmFsIHR5cGUgbWFwcyB0 by4gICovCi0gIGVudW0gbWFjaGluZV9tb2RlIG1vZGU7Ci0KLSAgLyogUXVhbGlmaWVycy4gICov Ci0gIGVudW0gYWFyY2g2NF90eXBlX3F1YWxpZmllcnMgcTsKLX07Ci0KICNkZWZpbmUgRU5UUlko RSwgTSwgUSwgRykgIFwKICAge0UsICJfXyIgI0UsICNHICJfXyIgI0UsIE5VTExfVFJFRSwgTlVM TF9UUkVFLCBFXyMjTSMjbW9kZSwgcXVhbGlmaWVyXyMjUX0sCiBzdGF0aWMgR1RZKCgpKSBzdHJ1 Y3QgYWFyY2g2NF9zaW1kX3R5cGVfaW5mbyBhYXJjaDY0X3NpbWRfdHlwZXMgW10gPSB7CkBAIC0y Nzk2LDYgKzI3MTksMTQgQEAgZ2V0X21lbV90eXBlX2Zvcl9sb2FkX3N0b3JlICh1bnNpZ25lZCBp bnQgZmNvZGUpCiAgIH0KIH0KIAorLyogUmV0dXJuIGFhcmNoNjRfc2ltZF90eXBlX2luZm8gY29y cmVzcG9uZGluZyB0byBUWVBFLiAgKi8KKworYWFyY2g2NF9zaW1kX3R5cGVfaW5mbworYWFyY2g2 NF9nZXRfc2ltZF9pbmZvX2Zvcl90eXBlIChlbnVtIGFhcmNoNjRfc2ltZF90eXBlIHR5cGUpCit7 CisgIHJldHVybiBhYXJjaDY0X3NpbWRfdHlwZXNbdHlwZV07Cit9CisKIC8qIFRyeSB0byBmb2xk IFNUTVQsIGdpdmVuIHRoYXQgaXQncyBhIGNhbGwgdG8gdGhlIGJ1aWx0LWluIGZ1bmN0aW9uIHdp dGgKICAgIHN1YmNvZGUgRkNPREUuICBSZXR1cm4gdGhlIG5ldyBzdGF0ZW1lbnQgb24gc3VjY2Vz cyBhbmQgbnVsbCBvbgogICAgZmFpbHVyZS4gICovCmRpZmYgLS1naXQgYS9nY2MvY29uZmlnL2Fh cmNoNjQvYWFyY2g2NC1idWlsdGlucy5oIGIvZ2NjL2NvbmZpZy9hYXJjaDY0L2FhcmNoNjQtYnVp bHRpbnMuaApuZXcgZmlsZSBtb2RlIDEwMDY0NAppbmRleCAwMDAwMDAwMDAwMC4uYjM5NTQwMjM3 OWMKLS0tIC9kZXYvbnVsbAorKysgYi9nY2MvY29uZmlnL2FhcmNoNjQvYWFyY2g2NC1idWlsdGlu cy5oCkBAIC0wLDAgKzEsODUgQEAKKyNpZm5kZWYgQUFSQ0g2NF9CVUlMVElOU19ICisjZGVmaW5l IEFBUkNINjRfQlVJTFRJTlNfSAorCisjZGVmaW5lIEVOVFJZKEUsIE0sIFEsIEcpIEUsCitlbnVt IGFhcmNoNjRfc2ltZF90eXBlCit7CisjaW5jbHVkZSAiYWFyY2g2NC1zaW1kLWJ1aWx0aW4tdHlw ZXMuZGVmIgorICBBUk1fTkVPTl9IX1RZUEVTX0xBU1QKK307CisjdW5kZWYgRU5UUlkKKworZW51 bSBhYXJjaDY0X3R5cGVfcXVhbGlmaWVycworeworICAvKiBUIGZvby4gICovCisgIHF1YWxpZmll cl9ub25lID0gMHgwLAorICAvKiB1bnNpZ25lZCBUIGZvby4gICovCisgIHF1YWxpZmllcl91bnNp Z25lZCA9IDB4MSwgLyogMSA8PCAwICAqLworICAvKiBjb25zdCBUIGZvby4gICovCisgIHF1YWxp Zmllcl9jb25zdCA9IDB4MiwgLyogMSA8PCAxICAqLworICAvKiBUICpmb28uICAqLworICBxdWFs aWZpZXJfcG9pbnRlciA9IDB4NCwgLyogMSA8PCAyICAqLworICAvKiBVc2VkIHdoZW4gZXhwYW5k aW5nIGFyZ3VtZW50cyBpZiBhbiBvcGVyYW5kIGNvdWxkCisgICAgIGJlIGFuIGltbWVkaWF0ZS4g ICovCisgIHF1YWxpZmllcl9pbW1lZGlhdGUgPSAweDgsIC8qIDEgPDwgMyAgKi8KKyAgcXVhbGlm aWVyX21heWJlX2ltbWVkaWF0ZSA9IDB4MTAsIC8qIDEgPDwgNCAgKi8KKyAgLyogdm9pZCBmb28g KC4uLikuICAqLworICBxdWFsaWZpZXJfdm9pZCA9IDB4MjAsIC8qIDEgPDwgNSAgKi8KKyAgLyog U29tZSBwYXR0ZXJucyBtYXkgaGF2ZSBpbnRlcm5hbCBvcGVyYW5kcywgdGhpcyBxdWFsaWZpZXIg aXMgYW4KKyAgICAgaW5zdHJ1Y3Rpb24gdG8gdGhlIGluaXRpYWxpc2F0aW9uIGNvZGUgdG8gc2tp cCB0aGlzIG9wZXJhbmQuICAqLworICBxdWFsaWZpZXJfaW50ZXJuYWwgPSAweDQwLCAvKiAxIDw8 IDYgICovCisgIC8qIFNvbWUgYnVpbHRpbnMgc2hvdWxkIHVzZSB0aGUgVF8qbW9kZSogZW5jb2Rl ZCBpbiBhIHNpbWRfYnVpbHRpbl9kYXR1bQorICAgICByYXRoZXIgdGhhbiB1c2luZyB0aGUgdHlw ZSBvZiB0aGUgb3BlcmFuZC4gICovCisgIHF1YWxpZmllcl9tYXBfbW9kZSA9IDB4ODAsIC8qIDEg PDwgNyAgKi8KKyAgLyogcXVhbGlmaWVyX3BvaW50ZXIgfCBxdWFsaWZpZXJfbWFwX21vZGUgICov CisgIHF1YWxpZmllcl9wb2ludGVyX21hcF9tb2RlID0gMHg4NCwKKyAgLyogcXVhbGlmaWVyX2Nv bnN0IHwgcXVhbGlmaWVyX3BvaW50ZXIgfCBxdWFsaWZpZXJfbWFwX21vZGUgICovCisgIHF1YWxp Zmllcl9jb25zdF9wb2ludGVyX21hcF9tb2RlID0gMHg4NiwKKyAgLyogUG9seW5vbWlhbCB0eXBl cy4gICovCisgIHF1YWxpZmllcl9wb2x5ID0gMHgxMDAsCisgIC8qIExhbmUgaW5kaWNlcyAtIG11 c3QgYmUgaW4gcmFuZ2UsIGFuZCBmbGlwcGVkIGZvciBiaWdlbmRpYW4uICAqLworICBxdWFsaWZp ZXJfbGFuZV9pbmRleCA9IDB4MjAwLAorICAvKiBMYW5lIGluZGljZXMgZm9yIHNpbmdsZSBsYW5l IHN0cnVjdHVyZSBsb2FkcyBhbmQgc3RvcmVzLiAgKi8KKyAgcXVhbGlmaWVyX3N0cnVjdF9sb2Fk X3N0b3JlX2xhbmVfaW5kZXggPSAweDQwMCwKKyAgLyogTGFuZSBpbmRpY2VzIHNlbGVjdGVkIGlu IHBhaXJzLiAtIG11c3QgYmUgaW4gcmFuZ2UsIGFuZCBmbGlwcGVkIGZvcgorICAgICBiaWdlbmRp YW4uICAqLworICBxdWFsaWZpZXJfbGFuZV9wYWlyX2luZGV4ID0gMHg4MDAsCisgIC8qIExhbmUg aW5kaWNlcyBzZWxlY3RlZCBpbiBxdWFkdHVwbGV0cy4gLSBtdXN0IGJlIGluIHJhbmdlLCBhbmQg ZmxpcHBlZCBmb3IKKyAgICAgYmlnZW5kaWFuLiAgKi8KKyAgcXVhbGlmaWVyX2xhbmVfcXVhZHR1 cF9pbmRleCA9IDB4MTAwMCwKK307CisKK3N0cnVjdCBHVFkoKCkpIGFhcmNoNjRfc2ltZF90eXBl X2luZm8KK3sKKyAgZW51bSBhYXJjaDY0X3NpbWRfdHlwZSB0eXBlOworCisgIC8qIEludGVybmFs IHR5cGUgbmFtZS4gICovCisgIGNvbnN0IGNoYXIgKm5hbWU7CisKKyAgLyogSW50ZXJuYWwgdHlw ZSBuYW1lKG1hbmdsZWQpLiAgVGhlIG1hbmdsZWQgbmFtZXMgY29uZm9ybSB0byB0aGUKKyAgICAg QUFQQ1M2NCAoc2VlICJQcm9jZWR1cmUgQ2FsbCBTdGFuZGFyZCBmb3IgdGhlIEFSTSA2NC1iaXQg QXJjaGl0ZWN0dXJlIiwKKyAgICAgQXBwZW5kaXggQSkuICBUbyBxdWFsaWZ5IGZvciBlbWlzc2lv biB3aXRoIHRoZSBtYW5nbGVkIG5hbWVzIGRlZmluZWQgaW4KKyAgICAgdGhhdCBkb2N1bWVudCwg YSB2ZWN0b3IgdHlwZSBtdXN0IG5vdCBvbmx5IGJlIG9mIHRoZSBjb3JyZWN0IG1vZGUgYnV0IGFs c28KKyAgICAgYmUgb2YgdGhlIGNvcnJlY3QgaW50ZXJuYWwgQWR2U0lNRCB2ZWN0b3IgdHlwZSAo ZS5nLiBfX0ludDh4OF90KTsgdGhlc2UKKyAgICAgdHlwZXMgYXJlIHJlZ2lzdGVyZWQgYnkgYWFy Y2g2NF9pbml0X3NpbWRfYnVpbHRpbl90eXBlcyAoKS4gIEluIG90aGVyCisgICAgIHdvcmRzLCB2 ZWN0b3IgdHlwZXMgZGVmaW5lZCBpbiBvdGhlciB3YXlzIGUuZy4gdmlhIHZlY3Rvcl9zaXplIGF0 dHJpYnV0ZQorICAgICB3aWxsIGdldCBkZWZhdWx0IG1hbmdsZWQgbmFtZXMuICAqLworICBjb25z dCBjaGFyICptYW5nbGU7CisKKyAgLyogSW50ZXJuYWwgdHlwZS4gICovCisgIHRyZWUgaXR5cGU7 CisKKyAgLyogRWxlbWVudCB0eXBlLiAgKi8KKyAgdHJlZSBlbHR5cGU7CisKKyAgLyogTWFjaGlu ZSBtb2RlIHRoZSBpbnRlcm5hbCB0eXBlIG1hcHMgdG8uICAqLworICBlbnVtIG1hY2hpbmVfbW9k ZSBtb2RlOworCisgIC8qIFF1YWxpZmllcnMuICAqLworICBlbnVtIGFhcmNoNjRfdHlwZV9xdWFs aWZpZXJzIHE7Cit9OworCithYXJjaDY0X3NpbWRfdHlwZV9pbmZvIGFhcmNoNjRfZ2V0X3NpbWRf aW5mb19mb3JfdHlwZSAoZW51bSBhYXJjaDY0X3NpbWRfdHlwZSk7CisKKyNlbmRpZiAvKiBBQVJD SDY0X0JVSUxUSU5TX0ggKi8KKwpkaWZmIC0tZ2l0IGEvZ2NjL2NvbmZpZy9hYXJjaDY0L2FhcmNo NjQtc3ZlLWJ1aWx0aW5zLWJhc2UuY2MgYi9nY2MvY29uZmlnL2FhcmNoNjQvYWFyY2g2NC1zdmUt YnVpbHRpbnMtYmFzZS5jYwppbmRleCAwMmU0MmE3MWU1ZS4uNTFlNmMxYTljYzQgMTAwNjQ0Ci0t LSBhL2djYy9jb25maWcvYWFyY2g2NC9hYXJjaDY0LXN2ZS1idWlsdGlucy1iYXNlLmNjCisrKyBi L2djYy9jb25maWcvYWFyY2g2NC9hYXJjaDY0LXN2ZS1idWlsdGlucy1iYXNlLmNjCkBAIC00NCw2 ICs0NCwxNCBAQAogI2luY2x1ZGUgImFhcmNoNjQtc3ZlLWJ1aWx0aW5zLXNoYXBlcy5oIgogI2lu Y2x1ZGUgImFhcmNoNjQtc3ZlLWJ1aWx0aW5zLWJhc2UuaCIKICNpbmNsdWRlICJhYXJjaDY0LXN2 ZS1idWlsdGlucy1mdW5jdGlvbnMuaCIKKyNpbmNsdWRlICJhYXJjaDY0LWJ1aWx0aW5zLmgiCisj aW5jbHVkZSAiZ2ltcGxlLXNzYS5oIgorI2luY2x1ZGUgInRyZWUtcGhpbm9kZXMuaCIKKyNpbmNs dWRlICJ0cmVlLXNzYS1vcGVyYW5kcy5oIgorI2luY2x1ZGUgInNzYS1pdGVyYXRvcnMuaCIKKyNp bmNsdWRlICJzdHJpbmdwb29sLmgiCisjaW5jbHVkZSAidmFsdWUtcmFuZ2UuaCIKKyNpbmNsdWRl ICJ0cmVlLXNzYW5hbWVzLmgiCiAKIHVzaW5nIG5hbWVzcGFjZSBhYXJjaDY0X3N2ZTsKIApAQCAt MTIwNyw2ICsxMjE1LDU2IEBAIHB1YmxpYzoKICAgICBpbnNuX2NvZGUgaWNvZGUgPSBjb2RlX2Zv cl9hYXJjaDY0X3N2ZV9sZDFycSAoZS52ZWN0b3JfbW9kZSAoMCkpOwogICAgIHJldHVybiBlLnVz ZV9jb250aWd1b3VzX2xvYWRfaW5zbiAoaWNvZGUpOwogICB9CisKKyAgZ2ltcGxlICoKKyAgZm9s ZCAoZ2ltcGxlX2ZvbGRlciAmZikgY29uc3QgT1ZFUlJJREUKKyAgeworICAgIHRyZWUgYXJnMCA9 IGdpbXBsZV9jYWxsX2FyZyAoZi5jYWxsLCAwKTsKKyAgICB0cmVlIGFyZzEgPSBnaW1wbGVfY2Fs bF9hcmcgKGYuY2FsbCwgMSk7CisKKyAgICAvKiBUcmFuc2Zvcm06CisgICAgICAgbGhzID0gc3Zs ZDFycSAoey0xLCAtMSwgLi4uIH0sIGFyZzEpCisgICAgICAgaW50bzoKKyAgICAgICB0bXAgPSBt ZW1fcmVmPGludDMyeDRfdD4gWyhpbnQgKiB7cmVmLWFsbH0pIGFyZzFdIAorICAgICAgIGxocyA9 IHZlY19wZXJtX2V4cHI8dG1wLCB0bXAsIHswLCAxLCAyLCAzLCAuLi59Pi4KKyAgICAgICBvbiBs aXR0bGUgZW5kaWFuIHRhcmdldC4gICovCisKKyAgICBpZiAoIUJZVEVTX0JJR19FTkRJQU4KKwkm JiBpbnRlZ2VyX2FsbF9vbmVzcCAoYXJnMCkpCQorICAgICAgeworCXRyZWUgbGhzID0gZ2ltcGxl X2NhbGxfbGhzIChmLmNhbGwpOworCWF1dG8gc2ltZF90eXBlID0gYWFyY2g2NF9nZXRfc2ltZF9p bmZvX2Zvcl90eXBlIChJbnQzMng0X3QpOworCisJdHJlZSBlbHRfcHRyX3R5cGUKKwkgID0gYnVp bGRfcG9pbnRlcl90eXBlX2Zvcl9tb2RlIChzaW1kX3R5cGUuZWx0eXBlLCBWT0lEbW9kZSwgdHJ1 ZSk7CisJdHJlZSB6ZXJvID0gYnVpbGRfemVyb19jc3QgKGVsdF9wdHJfdHlwZSk7CisKKwkvKiBV c2UgZWxlbWVudCB0eXBlIGFsaWdubWVudC4gICovCisJdHJlZSBhY2Nlc3NfdHlwZQorCSAgPSBi dWlsZF9hbGlnbmVkX3R5cGUgKHNpbWRfdHlwZS5pdHlwZSwgVFlQRV9BTElHTiAoc2ltZF90eXBl LmVsdHlwZSkpOworCisJdHJlZSB0bXAgPSBtYWtlX3NzYV9uYW1lX2ZuIChjZnVuLCBhY2Nlc3Nf dHlwZSwgMCk7CisJZ2ltcGxlICptZW1fcmVmX3N0bXQKKwkgID0gZ2ltcGxlX2J1aWxkX2Fzc2ln biAodG1wLCBmb2xkX2J1aWxkMiAoTUVNX1JFRiwgYWNjZXNzX3R5cGUsIGFyZzEsIHplcm8pKTsK Kwlnc2lfaW5zZXJ0X2JlZm9yZSAoZi5nc2ksIG1lbV9yZWZfc3RtdCwgR1NJX1NBTUVfU1RNVCk7 CisKKwl0cmVlIG1lbV9yZWZfbGhzID0gZ2ltcGxlX2dldF9saHMgKG1lbV9yZWZfc3RtdCk7CisJ dHJlZSB2ZWN0eXBlID0gVFJFRV9UWVBFIChtZW1fcmVmX2xocyk7CisJdHJlZSBsaHNfdHlwZSA9 IFRSRUVfVFlQRSAobGhzKTsKKworCWludCBzb3VyY2VfbmVsdHMgPSBUWVBFX1ZFQ1RPUl9TVUJQ QVJUUyAodmVjdHlwZSkudG9fY29uc3RhbnQgKCk7CisJdmVjX3Blcm1fYnVpbGRlciBzZWwgKFRZ UEVfVkVDVE9SX1NVQlBBUlRTIChsaHNfdHlwZSksIHNvdXJjZV9uZWx0cywgMSk7CisJZm9yIChp bnQgaSA9IDA7IGkgPCBzb3VyY2VfbmVsdHM7IGkrKykKKwkgIHNlbC5xdWlja19wdXNoIChpKTsK KworCXZlY19wZXJtX2luZGljZXMgaW5kaWNlcyAoc2VsLCAxLCBzb3VyY2VfbmVsdHMpOworCWdj Y19jaGVja2luZ19hc3NlcnQgKGNhbl92ZWNfcGVybV9jb25zdF9wIChUWVBFX01PREUgKGxoc190 eXBlKSwgaW5kaWNlcykpOworCXRyZWUgbWFzayA9IHZlY19wZXJtX2luZGljZXNfdG9fdHJlZSAo bGhzX3R5cGUsIGluZGljZXMpOworCXJldHVybiBnaW1wbGVfYnVpbGRfYXNzaWduIChsaHMsIFZF Q19QRVJNX0VYUFIsIG1lbV9yZWZfbGhzLCBtZW1fcmVmX2xocywgbWFzayk7CisgICAgICB9CisK KyAgICByZXR1cm4gTlVMTDsKKyAgfQogfTsKIAogY2xhc3Mgc3ZsZDFyb19pbXBsIDogcHVibGlj IGxvYWRfcmVwbGljYXRlCmRpZmYgLS1naXQgYS9nY2MvY29uZmlnL2FhcmNoNjQvYWFyY2g2NC5j IGIvZ2NjL2NvbmZpZy9hYXJjaDY0L2FhcmNoNjQuYwppbmRleCBmMDczMzBjZmY0Zi4uZGM2ZTVj YTFlMWQgMTAwNjQ0Ci0tLSBhL2djYy9jb25maWcvYWFyY2g2NC9hYXJjaDY0LmMKKysrIGIvZ2Nj L2NvbmZpZy9hYXJjaDY0L2FhcmNoNjQuYwpAQCAtMjMwMDksNiArMjMwMDksMzUgQEAgYWFyY2g2 NF9ldnBjX3N2ZV90YmwgKHN0cnVjdCBleHBhbmRfdmVjX3Blcm1fZCAqZCkKICAgcmV0dXJuIHRy dWU7CiB9CiAKKy8qIFRyeSB0byBpbXBsZW1lbnQgRCB1c2luZyBTVkUgZHVwIGluc3RydWN0aW9u LiAgKi8KKworc3RhdGljIGJvb2wKK2FhcmNoNjRfZXZwY19zdmVfZHVwIChzdHJ1Y3QgZXhwYW5k X3ZlY19wZXJtX2QgKmQpCit7CisgIGlmIChCWVRFU19CSUdfRU5ESUFOCisgICAgICB8fCBkLT5w ZXJtLmxlbmd0aCAoKS5pc19jb25zdGFudCAoKQorICAgICAgfHwgIWQtPm9uZV92ZWN0b3JfcAor ICAgICAgfHwgZC0+dGFyZ2V0ID09IE5VTEwKKyAgICAgIHx8IGQtPm9wMCA9PSBOVUxMCisgICAg ICB8fCBHRVRfTU9ERV9OVU5JVFMgKEdFVF9NT0RFIChkLT50YXJnZXQpKS5pc19jb25zdGFudCAo KQorICAgICAgfHwgIUdFVF9NT0RFX05VTklUUyAoR0VUX01PREUgKGQtPm9wMCkpLmlzX2NvbnN0 YW50ICgpKQorICAgIHJldHVybiBmYWxzZTsKKworICBpZiAoZC0+dGVzdGluZ19wKQorICAgIHJl dHVybiB0cnVlOworCisgIGludCBucGF0dGVybnMgPSBkLT5wZXJtLmVuY29kaW5nICgpLm5wYXR0 ZXJucyAoKTsKKyAgaWYgKCFrbm93bl9lcSAobnBhdHRlcm5zLCBHRVRfTU9ERV9OVU5JVFMgKEdF VF9NT0RFIChkLT5vcDApKSkpCisgICAgcmV0dXJuIGZhbHNlOworCisgIGZvciAoaW50IGkgPSAw OyBpIDwgbnBhdHRlcm5zOyBpKyspCisgICAgaWYgKCFrbm93bl9lcSAoZC0+cGVybVtpXSwgaSkp CisgICAgICByZXR1cm4gZmFsc2U7CisKKyAgYWFyY2g2NF9leHBhbmRfc3ZlX2R1cHEgKGQtPnRh cmdldCwgR0VUX01PREUgKGQtPnRhcmdldCksIGQtPm9wMCk7CisgIHJldHVybiB0cnVlOworfQor CiAvKiBUcnkgdG8gaW1wbGVtZW50IEQgdXNpbmcgU1ZFIFNFTCBpbnN0cnVjdGlvbi4gICovCiAK IHN0YXRpYyBib29sCkBAIC0yMzE2OSw3ICsyMzE5OCwxMiBAQCBhYXJjaDY0X2V4cGFuZF92ZWNf cGVybV9jb25zdF8xIChzdHJ1Y3QgZXhwYW5kX3ZlY19wZXJtX2QgKmQpCiAgICAgICBlbHNlIGlm IChhYXJjaDY0X2V2cGNfcmVlbmNvZGUgKGQpKQogCXJldHVybiB0cnVlOwogICAgICAgaWYgKGQt PnZlY19mbGFncyA9PSBWRUNfU1ZFX0RBVEEpCi0JcmV0dXJuIGFhcmNoNjRfZXZwY19zdmVfdGJs IChkKTsKKyAgICAgICAgeworCSAgaWYgKGFhcmNoNjRfZXZwY19zdmVfZHVwIChkKSkKKwkgICAg cmV0dXJuIHRydWU7CisJICBlbHNlIGlmIChhYXJjaDY0X2V2cGNfc3ZlX3RibCAoZCkpCisJICAg IHJldHVybiB0cnVlOworCX0KICAgICAgIGVsc2UgaWYgKGQtPnZlY19mbGFncyA9PSBWRUNfQURW U0lNRCkKIAlyZXR1cm4gYWFyY2g2NF9ldnBjX3RibCAoZCk7CiAgICAgfQpkaWZmIC0tZ2l0IGEv Z2NjL3Rlc3RzdWl0ZS9nY2MudGFyZ2V0L2FhcmNoNjQvc3ZlL2FjbGUvZ2VuZXJhbC9wcjk2NDYz LmMgYi9nY2MvdGVzdHN1aXRlL2djYy50YXJnZXQvYWFyY2g2NC9zdmUvYWNsZS9nZW5lcmFsL3By OTY0NjMuYwpuZXcgZmlsZSBtb2RlIDEwMDY0NAppbmRleCAwMDAwMDAwMDAwMC4uMzUxMDBhOWUw MWMKLS0tIC9kZXYvbnVsbAorKysgYi9nY2MvdGVzdHN1aXRlL2djYy50YXJnZXQvYWFyY2g2NC9z dmUvYWNsZS9nZW5lcmFsL3ByOTY0NjMuYwpAQCAtMCwwICsxLDE3IEBACisvKiB7IGRnLWRvIGNv bXBpbGUgfSAqLworLyogeyBkZy1vcHRpb25zICItTzMiIH0gKi8KKworI2luY2x1ZGUgImFybV9u ZW9uLmgiCisjaW5jbHVkZSAiYXJtX3N2ZS5oIgorCitzdmludDMyX3QgZjEgKGludDMyeDRfdCB4 KQoreworICByZXR1cm4gc3ZsZDFycSAoc3ZwdHJ1ZV9iOCAoKSwgJnhbMF0pOworfQorCitzdmlu dDMyX3QgZjIgKGludCAqeCkKK3sKKyAgcmV0dXJuIHN2bGQxcnEgKHN2cHRydWVfYjggKCksIHgp OworfQorCisvKiB7IGRnLWZpbmFsIHsgc2Nhbi1hc3NlbWJsZXItdGltZXMge1x0ZHVwXHR6WzAt OV0rXC5xLCB6WzAtOV0rXC5xXFswXF19IDIgeyB0YXJnZXQgYWFyY2g2NF9saXR0bGVfZW5kaWFu IH0gfSB9ICovCg== --00000000000052213105d41e1f05--