From mboxrd@z Thu Jan 1 00:00:00 1970
MIME-Version: 1.0
From: Prathamesh Kulkarni
Date: Mon, 27 Nov 2023 20:43:34 +0530
Subject: Re: PR111754
To: Prathamesh Kulkarni, gcc Patches, Richard Biener, richard.sandiford@arm.com
Content-Type: multipart/mixed; boundary="000000000000ff5743060b23c0ff"

--000000000000ff5743060b23c0ff
Content-Type: text/plain; charset="UTF-8"

On Fri, 24 Nov 2023 at 03:13, Richard Sandiford wrote:
>
> Prathamesh
Kulkarni writes: > > On Thu, 26 Oct 2023 at 09:43, Prathamesh Kulkarni > > wrote: > >> > >> On Thu, 26 Oct 2023 at 04:09, Richard Sandiford > >> wrote: > >> > > >> > Prathamesh Kulkarni writes: > >> > > On Wed, 25 Oct 2023 at 02:58, Richard Sandiford > >> > > wrote: > >> > >> So I think the PR could be solved by something like the attached. > >> > >> Do you agree? If so, could you base the patch on this instead? > >> > >> > >> > >> Only tested against the self-tests. > >> > >> > >> > >> Thanks, > >> > >> Richard > >> > >> > >> > >> diff --git a/gcc/fold-const.cc b/gcc/fold-const.cc > >> > >> index 40767736389..00fce4945a7 100644 > >> > >> --- a/gcc/fold-const.cc > >> > >> +++ b/gcc/fold-const.cc > >> > >> @@ -10743,27 +10743,37 @@ fold_vec_perm_cst (tree type, tree arg0, tree arg1, const vec_perm_indices &sel, > >> > >> unsigned res_npatterns, res_nelts_per_pattern; > >> > >> unsigned HOST_WIDE_INT res_nelts; > >> > >> > >> > >> - /* (1) If SEL is a suitable mask as determined by > >> > >> - valid_mask_for_fold_vec_perm_cst_p, then: > >> > >> - res_npatterns = max of npatterns between ARG0, ARG1, and SEL > >> > >> - res_nelts_per_pattern = max of nelts_per_pattern between > >> > >> - ARG0, ARG1 and SEL. > >> > >> - (2) If SEL is not a suitable mask, and TYPE is VLS then: > >> > >> - res_npatterns = nelts in result vector. > >> > >> - res_nelts_per_pattern = 1. > >> > >> - This exception is made so that VLS ARG0, ARG1 and SEL work as before. */ > >> > >> - if (valid_mask_for_fold_vec_perm_cst_p (arg0, arg1, sel, reason)) > >> > >> - { > >> > >> - res_npatterns > >> > >> - = std::max (VECTOR_CST_NPATTERNS (arg0), > >> > >> - std::max (VECTOR_CST_NPATTERNS (arg1), > >> > >> - sel.encoding ().npatterns ())); > >> > >> + /* First try to implement the fold in a VLA-friendly way. > >> > >> + > >> > >> + (1) If the selector is simply a duplication of N elements, the > >> > >> + result is likewise a duplication of N elements. 
> >> > >> + > >> > >> + (2) If the selector is N elements followed by a duplication > >> > >> + of N elements, the result is too. > >> > >> > >> > >> - res_nelts_per_pattern > >> > >> - = std::max (VECTOR_CST_NELTS_PER_PATTERN (arg0), > >> > >> - std::max (VECTOR_CST_NELTS_PER_PATTERN (arg1), > >> > >> - sel.encoding ().nelts_per_pattern ())); > >> > >> + (3) If the selector is N elements followed by an interleaving > >> > >> + of N linear series, the situation is more complex. > >> > >> > >> > >> + valid_mask_for_fold_vec_perm_cst_p detects whether we > >> > >> + can handle this case. If we can, then each of the N linear > >> > >> + series either (a) selects the same element each time or > >> > >> + (b) selects a linear series from one of the input patterns. > >> > >> + > >> > >> + If (b) holds for one of the linear series, the result > >> > >> + will contain a linear series, and so the result will have > >> > >> + the same shape as the selector. If (a) holds for all of > >> > >> + the lienar series, the result will be the same as (2) above. > >> > >> + > >> > >> + (b) can only hold if one of the inputs pattern has a > >> > >> + stepped encoding. */ > >> > >> + if (valid_mask_for_fold_vec_perm_cst_p (arg0, arg1, sel, reason)) > >> > >> + { > >> > >> + res_npatterns = sel.encoding ().npatterns (); > >> > >> + res_nelts_per_pattern = sel.encoding ().nelts_per_pattern (); > >> > >> + if (res_nelts_per_pattern == 3 > >> > >> + && VECTOR_CST_NELTS_PER_PATTERN (arg0) < 3 > >> > >> + && VECTOR_CST_NELTS_PER_PATTERN (arg1) < 3) > >> > >> + res_nelts_per_pattern = 2; > >> > > Um, in this case, should we set: > >> > > res_nelts_per_pattern = max (nelts_per_pattern (arg0), nelts_per_pattern(arg1)) > >> > > if both have nelts_per_pattern == 1 ? > >> > > >> > No, it still needs to be 2 even if arg0 and arg1 are duplicates. > >> > E.g. consider a selector that picks the first element of arg0 > >> > followed by a duplicate of the first element of arg1. 
> >> > > >> > > Also I suppose this matters only for non-integral element type, since > >> > > for integral element type, > >> > > vector_cst_elt will return the correct value even if the element is > >> > > not explicitly encoded and input vector is dup ? > >> > > >> > Yeah, but it might help even for integers. If we build fewer > >> > elements explicitly, and so read fewer implicitly-encoded inputs, > >> > there's less risk of running into: > >> > > >> > if (!can_div_trunc_p (sel[i], len, &q, &r)) > >> > { > >> > if (reason) > >> > *reason = "cannot divide selector element by arg len"; > >> > return NULL_TREE; > >> > } > >> Ah right, thanks for the clarification! > >> I am currently away on vacation and will return next Thursday, and > >> will post a follow up patch based on your patch. > >> Sorry for the delay. > > Hi, > > Sorry for slow response, I have rebased your patch and added couple of tests. > > The attached patch resulted in fallout for aarch64/sve/slp_3.c and > > aarch64/sve/slp_4.c. > > > > Specifically for slp_3.c, we didn't fold following case: > > arg0, arg1 are dup vectors. > > sel = { 0, len, 1, len + 1, 2, len + 2, ... } // (npatterns = 2, > > nelts_per_pattern = 3) > > because res_nelts_per_pattern was set to 3, and upon encountering 2, > > fold_vec_perm_cst returned false. > > > > With patch, we set res_nelts_per_pattern = 2 (since input vectors are > > dup), and thus gets folded to: > > res = { arg0[0], arg1[0], ... } // (2, 1) > > > > Which results in using ldrqd for loading the result instead of doing > > the permutation at runtime with mov and zip1. > > I have adjusted the tests for new code-gen. > > Does it look OK ? 
> > > > There's also this strange failure observed on x86_64, as well as on aarch64: > > New tests that FAIL (1 tests): > > libitm.c++/dropref.C -B > > /home/prathamesh.kulkarni/gnu-toolchain/gcc/gnu-964-5/bootstrap-build-after/aarch64-unknown-linux-gnu/./libitm/../libstdc++-v3/src/.libs > > execution test > > > > Looking at dropref.C: > > /* { dg-xfail-run-if "unsupported" { *-*-* } } */ > > #include > > > > char *pp; > > > > int main() > > { > > __transaction_atomic { > > _ITM_dropReferences (pp, 555); > > } > > return 0; > > } > > > > doesn't seem relevant to VEC_PERM_EXPR folding ? > > The patch otherwise passes bootstrap+test on aarch64-linux-gnu with > > and without SVE, and on x86_64-linux-gnu. > > > > Thanks, > > Prathamesh > >> > >> Thanks, > >> Prathamesh > >> > > >> > Thanks, > >> > Richard > > > > diff --git a/gcc/fold-const.cc b/gcc/fold-const.cc > > index 40767736389..75410869796 100644 > > --- a/gcc/fold-const.cc > > +++ b/gcc/fold-const.cc > > @@ -10743,27 +10743,38 @@ fold_vec_perm_cst (tree type, tree arg0, tree arg1, const vec_perm_indices &sel, > > unsigned res_npatterns, res_nelts_per_pattern; > > unsigned HOST_WIDE_INT res_nelts; > > > > - /* (1) If SEL is a suitable mask as determined by > > - valid_mask_for_fold_vec_perm_cst_p, then: > > - res_npatterns = max of npatterns between ARG0, ARG1, and SEL > > - res_nelts_per_pattern = max of nelts_per_pattern between > > - ARG0, ARG1 and SEL. > > - (2) If SEL is not a suitable mask, and TYPE is VLS then: > > - res_npatterns = nelts in result vector. > > - res_nelts_per_pattern = 1. > > - This exception is made so that VLS ARG0, ARG1 and SEL work as before. */ > > - if (valid_mask_for_fold_vec_perm_cst_p (arg0, arg1, sel, reason)) > > - { > > - res_npatterns > > - = std::max (VECTOR_CST_NPATTERNS (arg0), > > - std::max (VECTOR_CST_NPATTERNS (arg1), > > - sel.encoding ().npatterns ())); > > + /* First try to implement the fold in a VLA-friendly way. 
> > + > > + (1) If the selector is simply a duplication of N elements, the > > + result is likewise a duplication of N elements. > > + > > + (2) If the selector is N elements followed by a duplication > > + of N elements, the result is too. > > + > > + (3) If the selector is N elements followed by an interleaving > > + of N linear series, the situation is more complex. > > + > > + valid_mask_for_fold_vec_perm_cst_p detects whether we > > + can handle this case. If we can, then each of the N linear > > + series either (a) selects the same element each time or > > + (b) selects a linear series from one of the input patterns. > > + > > + If (b) holds for one of the linear series, the result > > + will contain a linear series, and so the result will have > > + the same shape as the selector. If (a) holds for all of > > + the lienar series, the result will be the same as (2) above. > > my typo: linear > > > > - res_nelts_per_pattern > > - = std::max (VECTOR_CST_NELTS_PER_PATTERN (arg0), > > - std::max (VECTOR_CST_NELTS_PER_PATTERN (arg1), > > - sel.encoding ().nelts_per_pattern ())); > > + (b) can only hold if one of the input patterns has a > > + stepped encoding. */ > > > > + if (valid_mask_for_fold_vec_perm_cst_p (arg0, arg1, sel, reason)) > > + { > > + res_npatterns = sel.encoding ().npatterns (); > > + res_nelts_per_pattern = sel.encoding ().nelts_per_pattern (); > > + if (res_nelts_per_pattern == 3 > > + && VECTOR_CST_NELTS_PER_PATTERN (arg0) < 3 > > + && VECTOR_CST_NELTS_PER_PATTERN (arg1) < 3) > > + res_nelts_per_pattern = 2; > > res_nelts = res_npatterns * res_nelts_per_pattern; > > } > > else if (TYPE_VECTOR_SUBPARTS (type).is_constant (&res_nelts)) > > @@ -17562,6 +17573,29 @@ test_nunits_min_2 (machine_mode vmode) > > tree expected_res[] = { ARG0(0), ARG1(0), ARG1(1) }; > > validate_res (1, 3, res, expected_res); > > } > > + > > + /* Case 8: Same as aarch64/sve/slp_3.c: > > + arg0, arg1 are dup vectors. > > + sel = { 0, len, 1, len+1, 2, len+2, ... 
} // (2, 3) > > + So res = { arg0[0], arg1[0], ... } // (2, 1) > > + > > + In this case, since the input vectors are dup, only the first two > > + elements per pattern in sel are considered significant. */ > > + { > > + tree arg0 = build_vec_cst_rand (vmode, 1, 1); > > + tree arg1 = build_vec_cst_rand (vmode, 1, 1); > > + poly_uint64 len = TYPE_VECTOR_SUBPARTS (TREE_TYPE (arg0)); > > + > > + vec_perm_builder builder (len, 2, 3); > > + poly_uint64 mask_elems[] = { 0, len, 1, len + 1, 2, len + 2 }; > > + builder_push_elems (builder, mask_elems); > > + > > + vec_perm_indices sel (builder, 2, len); > > + tree res = fold_vec_perm_cst (TREE_TYPE (arg0), arg0, arg1, sel); > > + > > + tree expected_res[] = { ARG0(0), ARG1(0) }; > > + validate_res (2, 1, res, expected_res); > > + } > > } > > } > > > > @@ -17730,6 +17764,45 @@ test_nunits_min_4 (machine_mode vmode) > > ASSERT_TRUE (res == NULL_TREE); > > ASSERT_TRUE (!strcmp (reason, "step is not multiple of npatterns")); > > } > > + > > + /* Case 8: PR111754: When input vector is not a stepped sequence, > > + check that the result is not a stepped sequence either, even > > + if sel has a stepped sequence. */ > > + { > > + tree arg0 = build_vec_cst_rand (vmode, 1, 2); > > + tree arg1 = build_vec_cst_rand (vmode, 1, 2); > > + poly_uint64 len = TYPE_VECTOR_SUBPARTS (TREE_TYPE (arg0)); > > + > > + vec_perm_builder builder (len, 1, 3); > > + poly_uint64 mask_elems[] = { 0, 1, 2 }; > > + builder_push_elems (builder, mask_elems); > > + > > + vec_perm_indices sel (builder, 2, len); > > + tree res = fold_vec_perm_cst (TREE_TYPE (arg0), arg0, arg1, sel); > > + > > + tree expected_res[] = { ARG0(0), ARG0(1) }; > > + validate_res (sel.encoding ().npatterns (), 2, res, expected_res); > > The test is OK, but I think it's worth noting that the fold_vec_perm_cst > arguments aren't canonical. 
Since sel selects only from the first input, > the canonical form would be: > > tree res = fold_vec_perm_cst (TREE_TYPE (arg0), arg0, arg0, sel); > > So OK with a comment, but also OK with the line above instead (and no arg1). > > > + } > > + > > + /* Case 9: If sel doesn't contain a stepped sequence, > > + check that the result has same encoding as sel, irrespective > > + of shape of input vectors. */ > > + { > > + tree arg0 = build_vec_cst_rand (vmode, 1, 3, 1); > > + tree arg1 = build_vec_cst_rand (vmode, 1, 3, 1); > > + poly_uint64 len = TYPE_VECTOR_SUBPARTS (TREE_TYPE (arg0)); > > + > > + vec_perm_builder builder (len, 1, 2); > > + poly_uint64 mask_elems[] = { 0, len }; > > + builder_push_elems (builder, mask_elems); > > + > > + vec_perm_indices sel (builder, 2, len); > > + tree res = fold_vec_perm_cst (TREE_TYPE (arg0), arg0, arg1, sel); > > + > > + tree expected_res[] = { ARG0(0), ARG1(0) }; > > + validate_res (sel.encoding ().npatterns (), > > + sel.encoding ().nelts_per_pattern (), res, expected_res); > > + } > > } > > } > > > > diff --git a/gcc/testsuite/gcc.dg/vect/pr111754.c b/gcc/testsuite/gcc.dg/vect/pr111754.c > > new file mode 100644 > > index 00000000000..7c1c16875c7 > > --- /dev/null > > +++ b/gcc/testsuite/gcc.dg/vect/pr111754.c > > @@ -0,0 +1,13 @@ > > +/* { dg-do compile } */ > > +/* { dg-options "-O2 -fdump-tree-optimized" } */ > > + > > +typedef float __attribute__((__vector_size__ (16))) F; > > + > > +F foo (F a, F b) > > +{ > > + F v = (F) { 9 }; > > + return __builtin_shufflevector (v, v, 1, 0, 1, 2); > > +} > > + > > +/* { dg-final { scan-tree-dump-not "VEC_PERM_EXPR" "optimized" } } */ > > +/* { dg-final { scan-tree-dump "return \{ 0.0, 9.0e\\+0, 0.0, 0.0 \}" "optimized" } } */ > > diff --git a/gcc/testsuite/gcc.target/aarch64/sve/slp_3.c b/gcc/testsuite/gcc.target/aarch64/sve/slp_3.c > > index 82dd43a4d98..cb649bc1aa9 100644 > > --- a/gcc/testsuite/gcc.target/aarch64/sve/slp_3.c > > +++ b/gcc/testsuite/gcc.target/aarch64/sve/slp_3.c > > 
@@ -33,21 +33,15 @@ TEST_ALL (VEC_PERM) > > > > /* 1 for each 8-bit type. */ > > /* { dg-final { scan-assembler-times {\tld1rw\tz[0-9]+\.s, } 2 } } */ > > -/* 1 for each 16-bit type plus 1 for double. */ > > -/* { dg-final { scan-assembler-times {\tld1rd\tz[0-9]+\.d, } 4 } } */ > > +/* 1 for each 16-bit type */ > > +/* { dg-final { scan-assembler-times {\tld1rd\tz[0-9]+\.d, } 3 } } */ > > /* 1 for each 32-bit type. */ > > /* { dg-final { scan-assembler-times {\tld1rqw\tz[0-9]+\.s, } 3 } } */ > > -/* { dg-final { scan-assembler-times {\tmov\tz[0-9]+\.d, #41\n} 2 } } */ > > -/* { dg-final { scan-assembler-times {\tmov\tz[0-9]+\.d, #25\n} 2 } } */ > > -/* { dg-final { scan-assembler-times {\tmov\tz[0-9]+\.d, #31\n} 2 } } */ > > -/* { dg-final { scan-assembler-times {\tmov\tz[0-9]+\.d, #62\n} 2 } } */ > > Let's replace the deleted lines with: > > /* { dg-final { scan-assembler-times {\tld1rqd\tz[0-9]+\.d, } 6 } } */ > > > -/* 3 for double. */ > > -/* { dg-final { scan-assembler-times {\tmov\tz[0-9]+\.d, x[0-9]+\n} 3 } } */ > > /* The 64-bit types need: > > > > ZIP1 ZIP1 (2 ZIP2s optimized away) > > This line should be deleted, now that the ZIP1s are gone. > > > ZIP1 ZIP2. */ > > -/* { dg-final { scan-assembler-times {\tzip1\tz[0-9]+\.d, z[0-9]+\.d, z[0-9]+\.d\n} 9 } } */ > > +/* { dg-final { scan-assembler-times {\tzip1\tz[0-9]+\.d, z[0-9]+\.d, z[0-9]+\.d\n} 3 } } */ > > /* { dg-final { scan-assembler-times {\tzip2\tz[0-9]+\.d, z[0-9]+\.d, z[0-9]+\.d\n} 3 } } */ > > > > /* The loop should be fully-masked. The 64-bit types need two loads > > diff --git a/gcc/testsuite/gcc.target/aarch64/sve/slp_4.c b/gcc/testsuite/gcc.target/aarch64/sve/slp_4.c > > index b1fa5e3cf68..ce940a28597 100644 > > --- a/gcc/testsuite/gcc.target/aarch64/sve/slp_4.c > > +++ b/gcc/testsuite/gcc.target/aarch64/sve/slp_4.c > > @@ -35,20 +35,10 @@ vec_slp_##TYPE (TYPE *restrict a, int n) \ > > > > TEST_ALL (VEC_PERM) > > > > -/* 1 for each 8-bit type, 4 for each 32-bit type and 4 for double. 
*/ > > -/* { dg-final { scan-assembler-times {\tld1rd\tz[0-9]+\.d, } 18 } } */ > > +/* 1 for each 8-bit type */ > > +/* { dg-final { scan-assembler-times {\tld1rd\tz[0-9]+\.d, } 2 } } */ > > /* 1 for each 16-bit type. */ > > /* { dg-final { scan-assembler-times {\tld1rqh\tz[0-9]+\.h, } 3 } } */ > > -/* { dg-final { scan-assembler-times {\tmov\tz[0-9]+\.d, #99\n} 2 } } */ > > -/* { dg-final { scan-assembler-times {\tmov\tz[0-9]+\.d, #11\n} 2 } } */ > > -/* { dg-final { scan-assembler-times {\tmov\tz[0-9]+\.d, #17\n} 2 } } */ > > -/* { dg-final { scan-assembler-times {\tmov\tz[0-9]+\.d, #80\n} 2 } } */ > > -/* { dg-final { scan-assembler-times {\tmov\tz[0-9]+\.d, #63\n} 2 } } */ > > -/* { dg-final { scan-assembler-times {\tmov\tz[0-9]+\.d, #37\n} 2 } } */ > > -/* { dg-final { scan-assembler-times {\tmov\tz[0-9]+\.d, #24\n} 2 } } */ > > -/* { dg-final { scan-assembler-times {\tmov\tz[0-9]+\.d, #81\n} 2 } } */ > > -/* 4 for double. */ > > -/* { dg-final { scan-assembler-times {\tmov\tz[0-9]+\.d, x[0-9]+\n} 4 } } */ > > Similarly here: > > /* { dg-final { scan-assembler-times {\tld1rqd\tz[0-9]+\.d, } 18 } } */ > > > /* The 32-bit types need: > > > > ZIP1 ZIP1 (2 ZIP2s optimized away) > > This line should be deleted. > > > @@ -59,7 +49,7 @@ TEST_ALL (VEC_PERM) > > ZIP1 ZIP1 ZIP1 ZIP1 (4 ZIP2s optimized away) > > Same here. > > OK with those changes, and sorry for the slow review.

Hi Richard,
Thanks for the suggestions. I have made the changes in the attached patch.
Bootstrapped and tested with and without SVE on aarch64-linux-gnu, and on x86_64-linux-gnu.
Everything passes except for the dropref.C failure mentioned above, but I assume that is spurious since it is unrelated to VEC_PERM_EXPR folding?
Is it OK to commit the patch to trunk?

Thanks,
Prathamesh

> > Thanks, > Richard > > > ZIP1 ZIP2 ZIP1 ZIP2 > > ZIP1 ZIP2 ZIP1 ZIP2. 
*/ > > -/* { dg-final { scan-assembler-times {\tzip1\tz[0-9]+\.d, z[0-9]+\.d, z[0-9]+\.d\n} 33 } } */ > > +/* { dg-final { scan-assembler-times {\tzip1\tz[0-9]+\.d, z[0-9]+\.d, z[0-9]+\.d\n} 15 } } */ > > /* { dg-final { scan-assembler-times {\tzip2\tz[0-9]+\.d, z[0-9]+\.d, z[0-9]+\.d\n} 15 } } */ > > > > /* The loop should be fully-masked. The 32-bit types need two loads --000000000000ff5743060b23c0ff Content-Type: text/plain; charset="US-ASCII"; name="gnu-964-7.txt" Content-Disposition: attachment; filename="gnu-964-7.txt" Content-Transfer-Encoding: base64 Content-ID: X-Attachment-Id: f_lph1p51j0 UFIxMTE3NTQ6IFJld29yayBlbmNvZGluZyBvZiByZXN1bHQgZm9yIFZFQ19QRVJNX0VYUFIgd2l0 aCBjb25zdGFudCBpbnB1dCB2ZWN0b3JzLgoKZ2NjL0NoYW5nZUxvZzoKCVBSIG1pZGRsZS1lbmQv MTExNzU0CgkqIGZvbGQtY29uc3QuY2MgKGZvbGRfdmVjX3Blcm1fY3N0KTogU2V0IHJlc3VsdCdz IGVuY29kaW5nIHRvIHNlbCdzCgllbmNvZGluZywgYW5kIHNldCByZXNfbmVsdHNfcGVyX3BhdHRl cm4gdG8gMiBpZiBzZWwgY29udGFpbnMgc3RlcHBlZAoJc2VxdWVuY2UgYnV0IGlucHV0IHZlY3Rv cnMgZG8gbm90LgoJKHRlc3RfbnVuaXRzX21pbl8yKTogTmV3IHRlc3QgQ2FzZSA4LgoJKHRlc3Rf bnVuaXRzX21pbl80KTogTmV3IHRlc3RzIENhc2UgOCBhbmQgQ2FzZSA5LgoKZ2NjL3Rlc3RzdWl0 ZS9DaGFuZ2VMb2c6CglQUiBtaWRkbGUtZW5kLzExMTc1NAoJKiBnY2MudGFyZ2V0L2FhcmNoNjQv c3ZlL3NscF8zLmM6IEFkanVzdCBjb2RlLWdlbi4KCSogZ2NjLnRhcmdldC9hYXJjaDY0L3N2ZS9z bHBfNC5jOiBMaWtld2lzZS4KCSogZ2NjLmRnL3ZlY3QvcHIxMTE3NTQuYzogTmV3IHRlc3QuCgpD by1hdXRob3JlZC1ieTogUmljaGFyZCBTYW5kaWZvcmQgPHJpY2hhcmQuc2FuZGlmb3JkQGFybS5j b20+CgpkaWZmIC0tZ2l0IGEvZ2NjL2ZvbGQtY29uc3QuY2MgYi9nY2MvZm9sZC1jb25zdC5jYwpp bmRleCAzMzJiYzhhZWFkMi4uZGZmMDliODFmN2IgMTAwNjQ0Ci0tLSBhL2djYy9mb2xkLWNvbnN0 LmNjCisrKyBiL2djYy9mb2xkLWNvbnN0LmNjCkBAIC0xMDgwMywyNyArMTA4MDMsMzggQEAgZm9s ZF92ZWNfcGVybV9jc3QgKHRyZWUgdHlwZSwgdHJlZSBhcmcwLCB0cmVlIGFyZzEsIGNvbnN0IHZl Y19wZXJtX2luZGljZXMgJnNlbCwKICAgdW5zaWduZWQgcmVzX25wYXR0ZXJucywgcmVzX25lbHRz X3Blcl9wYXR0ZXJuOwogICB1bnNpZ25lZCBIT1NUX1dJREVfSU5UIHJlc19uZWx0czsKIAotICAv KiAoMSkgSWYgU0VMIGlzIGEgc3VpdGFibGUgbWFzayBhcyBkZXRlcm1pbmVkIGJ5Ci0gICAgIHZh 
bGlkX21hc2tfZm9yX2ZvbGRfdmVjX3Blcm1fY3N0X3AsIHRoZW46Ci0gICAgIHJlc19ucGF0dGVy bnMgPSBtYXggb2YgbnBhdHRlcm5zIGJldHdlZW4gQVJHMCwgQVJHMSwgYW5kIFNFTAotICAgICBy ZXNfbmVsdHNfcGVyX3BhdHRlcm4gPSBtYXggb2YgbmVsdHNfcGVyX3BhdHRlcm4gYmV0d2Vlbgot CQkJICAgICBBUkcwLCBBUkcxIGFuZCBTRUwuCi0gICAgICgyKSBJZiBTRUwgaXMgbm90IGEgc3Vp dGFibGUgbWFzaywgYW5kIFRZUEUgaXMgVkxTIHRoZW46Ci0gICAgIHJlc19ucGF0dGVybnMgPSBu ZWx0cyBpbiByZXN1bHQgdmVjdG9yLgotICAgICByZXNfbmVsdHNfcGVyX3BhdHRlcm4gPSAxLgot ICAgICBUaGlzIGV4Y2VwdGlvbiBpcyBtYWRlIHNvIHRoYXQgVkxTIEFSRzAsIEFSRzEgYW5kIFNF TCB3b3JrIGFzIGJlZm9yZS4gICovCi0gIGlmICh2YWxpZF9tYXNrX2Zvcl9mb2xkX3ZlY19wZXJt X2NzdF9wIChhcmcwLCBhcmcxLCBzZWwsIHJlYXNvbikpCi0gICAgewotICAgICAgcmVzX25wYXR0 ZXJucwotCT0gc3RkOjptYXggKFZFQ1RPUl9DU1RfTlBBVFRFUk5TIChhcmcwKSwKLQkJICAgIHN0 ZDo6bWF4IChWRUNUT1JfQ1NUX05QQVRURVJOUyAoYXJnMSksCi0JCQkgICAgICBzZWwuZW5jb2Rp bmcgKCkubnBhdHRlcm5zICgpKSk7CisgIC8qIEZpcnN0IHRyeSB0byBpbXBsZW1lbnQgdGhlIGZv bGQgaW4gYSBWTEEtZnJpZW5kbHkgd2F5LgorCisgICAgICgxKSBJZiB0aGUgc2VsZWN0b3IgaXMg c2ltcGx5IGEgZHVwbGljYXRpb24gb2YgTiBlbGVtZW50cywgdGhlCisJIHJlc3VsdCBpcyBsaWtl d2lzZSBhIGR1cGxpY2F0aW9uIG9mIE4gZWxlbWVudHMuCisKKyAgICAgKDIpIElmIHRoZSBzZWxl Y3RvciBpcyBOIGVsZW1lbnRzIGZvbGxvd2VkIGJ5IGEgZHVwbGljYXRpb24KKwkgb2YgTiBlbGVt ZW50cywgdGhlIHJlc3VsdCBpcyB0b28uCisKKyAgICAgKDMpIElmIHRoZSBzZWxlY3RvciBpcyBO IGVsZW1lbnRzIGZvbGxvd2VkIGJ5IGFuIGludGVybGVhdmluZworCSBvZiBOIGxpbmVhciBzZXJp ZXMsIHRoZSBzaXR1YXRpb24gaXMgbW9yZSBjb21wbGV4LgorCisJIHZhbGlkX21hc2tfZm9yX2Zv bGRfdmVjX3Blcm1fY3N0X3AgZGV0ZWN0cyB3aGV0aGVyIHdlCisJIGNhbiBoYW5kbGUgdGhpcyBj YXNlLiAgSWYgd2UgY2FuLCB0aGVuIGVhY2ggb2YgdGhlIE4gbGluZWFyCisJIHNlcmllcyBlaXRo ZXIgKGEpIHNlbGVjdHMgdGhlIHNhbWUgZWxlbWVudCBlYWNoIHRpbWUgb3IKKwkgKGIpIHNlbGVj dHMgYSBsaW5lYXIgc2VyaWVzIGZyb20gb25lIG9mIHRoZSBpbnB1dCBwYXR0ZXJucy4KIAotICAg ICAgcmVzX25lbHRzX3Blcl9wYXR0ZXJuCi0JPSBzdGQ6Om1heCAoVkVDVE9SX0NTVF9ORUxUU19Q RVJfUEFUVEVSTiAoYXJnMCksCi0JCSAgICBzdGQ6Om1heCAoVkVDVE9SX0NTVF9ORUxUU19QRVJf 
UEFUVEVSTiAoYXJnMSksCi0JCQkgICAgICBzZWwuZW5jb2RpbmcgKCkubmVsdHNfcGVyX3BhdHRl cm4gKCkpKTsKKwkgSWYgKGIpIGhvbGRzIGZvciBvbmUgb2YgdGhlIGxpbmVhciBzZXJpZXMsIHRo ZSByZXN1bHQKKwkgd2lsbCBjb250YWluIGEgbGluZWFyIHNlcmllcywgYW5kIHNvIHRoZSByZXN1 bHQgd2lsbCBoYXZlCisJIHRoZSBzYW1lIHNoYXBlIGFzIHRoZSBzZWxlY3Rvci4gIElmIChhKSBo b2xkcyBmb3IgYWxsIG9mCisJIHRoZSBsaW5lYXIgc2VyaWVzLCB0aGUgcmVzdWx0IHdpbGwgYmUg dGhlIHNhbWUgYXMgKDIpIGFib3ZlLgogCisJIChiKSBjYW4gb25seSBob2xkIGlmIG9uZSBvZiB0 aGUgaW5wdXQgcGF0dGVybnMgaGFzIGEKKwkgc3RlcHBlZCBlbmNvZGluZy4gICovCisKKyAgaWYg KHZhbGlkX21hc2tfZm9yX2ZvbGRfdmVjX3Blcm1fY3N0X3AgKGFyZzAsIGFyZzEsIHNlbCwgcmVh c29uKSkKKyAgICB7CisgICAgICByZXNfbnBhdHRlcm5zID0gc2VsLmVuY29kaW5nICgpLm5wYXR0 ZXJucyAoKTsKKyAgICAgIHJlc19uZWx0c19wZXJfcGF0dGVybiA9IHNlbC5lbmNvZGluZyAoKS5u ZWx0c19wZXJfcGF0dGVybiAoKTsKKyAgICAgIGlmIChyZXNfbmVsdHNfcGVyX3BhdHRlcm4gPT0g MworCSAgJiYgVkVDVE9SX0NTVF9ORUxUU19QRVJfUEFUVEVSTiAoYXJnMCkgPCAzCisJICAmJiBW RUNUT1JfQ1NUX05FTFRTX1BFUl9QQVRURVJOIChhcmcxKSA8IDMpCisJcmVzX25lbHRzX3Blcl9w YXR0ZXJuID0gMjsKICAgICAgIHJlc19uZWx0cyA9IHJlc19ucGF0dGVybnMgKiByZXNfbmVsdHNf cGVyX3BhdHRlcm47CiAgICAgfQogICBlbHNlIGlmIChUWVBFX1ZFQ1RPUl9TVUJQQVJUUyAodHlw ZSkuaXNfY29uc3RhbnQgKCZyZXNfbmVsdHMpKQpAQCAtMTc2MjIsNiArMTc2MzMsMjkgQEAgdGVz dF9udW5pdHNfbWluXzIgKG1hY2hpbmVfbW9kZSB2bW9kZSkKIAl0cmVlIGV4cGVjdGVkX3Jlc1td ID0geyBBUkcwKDApLCBBUkcxKDApLCBBUkcxKDEpIH07CiAJdmFsaWRhdGVfcmVzICgxLCAzLCBy ZXMsIGV4cGVjdGVkX3Jlcyk7CiAgICAgICB9CisKKyAgICAgIC8qIENhc2UgODogU2FtZSBhcyBh YXJjaDY0L3N2ZS9zbHBfMy5jOgorCSBhcmcwLCBhcmcxIGFyZSBkdXAgdmVjdG9ycy4KKwkgc2Vs ID0geyAwLCBsZW4sIDEsIGxlbisxLCAyLCBsZW4rMiwgLi4uIH0gLy8gKDIsIDMpCisJIFNvIHJl cyA9IHsgYXJnMFswXSwgYXJnMVswXSwgLi4uIH0gLy8gKDIsIDEpCisKKwkgSW4gdGhpcyBjYXNl LCBzaW5jZSB0aGUgaW5wdXQgdmVjdG9ycyBhcmUgZHVwLCBvbmx5IHRoZSBmaXJzdCB0d28KKwkg ZWxlbWVudHMgcGVyIHBhdHRlcm4gaW4gc2VsIGFyZSBjb25zaWRlcmVkIHNpZ25pZmljYW50LiAg Ki8KKyAgICAgIHsKKwl0cmVlIGFyZzAgPSBidWlsZF92ZWNfY3N0X3JhbmQgKHZtb2RlLCAxLCAx 
KTsKKwl0cmVlIGFyZzEgPSBidWlsZF92ZWNfY3N0X3JhbmQgKHZtb2RlLCAxLCAxKTsKKwlwb2x5 X3VpbnQ2NCBsZW4gPSBUWVBFX1ZFQ1RPUl9TVUJQQVJUUyAoVFJFRV9UWVBFIChhcmcwKSk7CisK Kwl2ZWNfcGVybV9idWlsZGVyIGJ1aWxkZXIgKGxlbiwgMiwgMyk7CisJcG9seV91aW50NjQgbWFz a19lbGVtc1tdID0geyAwLCBsZW4sIDEsIGxlbiArIDEsIDIsIGxlbiArIDIgfTsKKwlidWlsZGVy X3B1c2hfZWxlbXMgKGJ1aWxkZXIsIG1hc2tfZWxlbXMpOworCisJdmVjX3Blcm1faW5kaWNlcyBz ZWwgKGJ1aWxkZXIsIDIsIGxlbik7CisJdHJlZSByZXMgPSBmb2xkX3ZlY19wZXJtX2NzdCAoVFJF RV9UWVBFIChhcmcwKSwgYXJnMCwgYXJnMSwgc2VsKTsKKworCXRyZWUgZXhwZWN0ZWRfcmVzW10g PSB7IEFSRzAoMCksIEFSRzEoMCkgfTsKKwl2YWxpZGF0ZV9yZXMgKDIsIDEsIHJlcywgZXhwZWN0 ZWRfcmVzKTsKKyAgICAgIH0KICAgICB9CiB9CiAKQEAgLTE3NzkwLDYgKzE3ODI0LDQ0IEBAIHRl c3RfbnVuaXRzX21pbl80IChtYWNoaW5lX21vZGUgdm1vZGUpCiAJQVNTRVJUX1RSVUUgKHJlcyA9 PSBOVUxMX1RSRUUpOwogCUFTU0VSVF9UUlVFICghc3RyY21wIChyZWFzb24sICJzdGVwIGlzIG5v dCBtdWx0aXBsZSBvZiBucGF0dGVybnMiKSk7CiAgICAgICB9CisKKyAgICAgIC8qIENhc2UgODog UFIxMTE3NTQ6IFdoZW4gaW5wdXQgdmVjdG9yIGlzIG5vdCBhIHN0ZXBwZWQgc2VxdWVuY2UsCisJ IGNoZWNrIHRoYXQgdGhlIHJlc3VsdCBpcyBub3QgYSBzdGVwcGVkIHNlcXVlbmNlIGVpdGhlciwg ZXZlbgorCSBpZiBzZWwgaGFzIGEgc3RlcHBlZCBzZXF1ZW5jZS4gICovCisgICAgICB7CisJdHJl ZSBhcmcwID0gYnVpbGRfdmVjX2NzdF9yYW5kICh2bW9kZSwgMSwgMik7CisJcG9seV91aW50NjQg bGVuID0gVFlQRV9WRUNUT1JfU1VCUEFSVFMgKFRSRUVfVFlQRSAoYXJnMCkpOworCisJdmVjX3Bl cm1fYnVpbGRlciBidWlsZGVyIChsZW4sIDEsIDMpOworCXBvbHlfdWludDY0IG1hc2tfZWxlbXNb XSA9IHsgMCwgMSwgMiB9OworCWJ1aWxkZXJfcHVzaF9lbGVtcyAoYnVpbGRlciwgbWFza19lbGVt cyk7CisKKwl2ZWNfcGVybV9pbmRpY2VzIHNlbCAoYnVpbGRlciwgMSwgbGVuKTsKKwl0cmVlIHJl cyA9IGZvbGRfdmVjX3Blcm1fY3N0IChUUkVFX1RZUEUgKGFyZzApLCBhcmcwLCBhcmcwLCBzZWwp OworCisJdHJlZSBleHBlY3RlZF9yZXNbXSA9IHsgQVJHMCgwKSwgQVJHMCgxKSB9OworCXZhbGlk YXRlX3JlcyAoc2VsLmVuY29kaW5nICgpLm5wYXR0ZXJucyAoKSwgMiwgcmVzLCBleHBlY3RlZF9y ZXMpOworICAgICAgfQorCisgICAgICAvKiBDYXNlIDk6IElmIHNlbCBkb2Vzbid0IGNvbnRhaW4g YSBzdGVwcGVkIHNlcXVlbmNlLAorCSBjaGVjayB0aGF0IHRoZSByZXN1bHQgaGFzIHNhbWUgZW5j 
b2RpbmcgYXMgc2VsLCBpcnJlc3BlY3RpdmUKKwkgb2Ygc2hhcGUgb2YgaW5wdXQgdmVjdG9ycy4g ICovCisgICAgICB7CisJdHJlZSBhcmcwID0gYnVpbGRfdmVjX2NzdF9yYW5kICh2bW9kZSwgMSwg MywgMSk7CisJdHJlZSBhcmcxID0gYnVpbGRfdmVjX2NzdF9yYW5kICh2bW9kZSwgMSwgMywgMSk7 CisJcG9seV91aW50NjQgbGVuID0gVFlQRV9WRUNUT1JfU1VCUEFSVFMgKFRSRUVfVFlQRSAoYXJn MCkpOworCisJdmVjX3Blcm1fYnVpbGRlciBidWlsZGVyIChsZW4sIDEsIDIpOworCXBvbHlfdWlu dDY0IG1hc2tfZWxlbXNbXSA9IHsgMCwgbGVuIH07CisJYnVpbGRlcl9wdXNoX2VsZW1zIChidWls ZGVyLCBtYXNrX2VsZW1zKTsKKworCXZlY19wZXJtX2luZGljZXMgc2VsIChidWlsZGVyLCAyLCBs ZW4pOworCXRyZWUgcmVzID0gZm9sZF92ZWNfcGVybV9jc3QgKFRSRUVfVFlQRSAoYXJnMCksIGFy ZzAsIGFyZzEsIHNlbCk7CisKKwl0cmVlIGV4cGVjdGVkX3Jlc1tdID0geyBBUkcwKDApLCBBUkcx KDApIH07CisJdmFsaWRhdGVfcmVzIChzZWwuZW5jb2RpbmcgKCkubnBhdHRlcm5zICgpLAorCQkg ICAgICBzZWwuZW5jb2RpbmcgKCkubmVsdHNfcGVyX3BhdHRlcm4gKCksIHJlcywgZXhwZWN0ZWRf cmVzKTsKKyAgICAgIH0KICAgICB9CiB9CiAKZGlmZiAtLWdpdCBhL2djYy90ZXN0c3VpdGUvZ2Nj LmRnL3ZlY3QvcHIxMTE3NTQuYyBiL2djYy90ZXN0c3VpdGUvZ2NjLmRnL3ZlY3QvcHIxMTE3NTQu YwpuZXcgZmlsZSBtb2RlIDEwMDY0NAppbmRleCAwMDAwMDAwMDAwMC4uN2MxYzE2ODc1YzcKLS0t IC9kZXYvbnVsbAorKysgYi9nY2MvdGVzdHN1aXRlL2djYy5kZy92ZWN0L3ByMTExNzU0LmMKQEAg LTAsMCArMSwxMyBAQAorLyogeyBkZy1kbyBjb21waWxlIH0gKi8KKy8qIHsgZGctb3B0aW9ucyAi LU8yIC1mZHVtcC10cmVlLW9wdGltaXplZCIgfSAqLworCit0eXBlZGVmIGZsb2F0IF9fYXR0cmli dXRlX18oKF9fdmVjdG9yX3NpemVfXyAoMTYpKSkgRjsKKworRiBmb28gKEYgYSwgRiBiKQorewor ICBGIHYgPSAoRikgeyA5IH07CisgIHJldHVybiBfX2J1aWx0aW5fc2h1ZmZsZXZlY3RvciAodiwg diwgMSwgMCwgMSwgMik7Cit9CisKKy8qIHsgZGctZmluYWwgeyBzY2FuLXRyZWUtZHVtcC1ub3Qg IlZFQ19QRVJNX0VYUFIiICJvcHRpbWl6ZWQiIH0gfSAqLworLyogeyBkZy1maW5hbCB7IHNjYW4t dHJlZS1kdW1wICJyZXR1cm4gXHsgMC4wLCA5LjBlXFwrMCwgMC4wLCAwLjAgXH0iICJvcHRpbWl6 ZWQiIH0gfSAqLwpkaWZmIC0tZ2l0IGEvZ2NjL3Rlc3RzdWl0ZS9nY2MudGFyZ2V0L2FhcmNoNjQv c3ZlL3NscF8zLmMgYi9nY2MvdGVzdHN1aXRlL2djYy50YXJnZXQvYWFyY2g2NC9zdmUvc2xwXzMu YwppbmRleCA4MmRkNDNhNGQ5OC4uNzc1YzFlMWQ1MzAgMTAwNjQ0Ci0tLSBhL2djYy90ZXN0c3Vp 
dGUvZ2NjLnRhcmdldC9hYXJjaDY0L3N2ZS9zbHBfMy5jCisrKyBiL2djYy90ZXN0c3VpdGUvZ2Nj LnRhcmdldC9hYXJjaDY0L3N2ZS9zbHBfMy5jCkBAIC0zMywyMSArMzMsMTQgQEAgVEVTVF9BTEwg KFZFQ19QRVJNKQogCiAvKiAxIGZvciBlYWNoIDgtYml0IHR5cGUuICAqLwogLyogeyBkZy1maW5h bCB7IHNjYW4tYXNzZW1ibGVyLXRpbWVzIHtcdGxkMXJ3XHR6WzAtOV0rXC5zLCB9IDIgfSB9ICov Ci0vKiAxIGZvciBlYWNoIDE2LWJpdCB0eXBlIHBsdXMgMSBmb3IgZG91YmxlLiAgKi8KLS8qIHsg ZGctZmluYWwgeyBzY2FuLWFzc2VtYmxlci10aW1lcyB7XHRsZDFyZFx0elswLTldK1wuZCwgfSA0 IH0gfSAqLworLyogMSBmb3IgZWFjaCAxNi1iaXQgdHlwZSAgKi8KKy8qIHsgZGctZmluYWwgeyBz Y2FuLWFzc2VtYmxlci10aW1lcyB7XHRsZDFyZFx0elswLTldK1wuZCwgfSAzIH0gfSAqLwogLyog MSBmb3IgZWFjaCAzMi1iaXQgdHlwZS4gICovCiAvKiB7IGRnLWZpbmFsIHsgc2Nhbi1hc3NlbWJs ZXItdGltZXMge1x0bGQxcnF3XHR6WzAtOV0rXC5zLCB9IDMgfSB9ICovCi0vKiB7IGRnLWZpbmFs IHsgc2Nhbi1hc3NlbWJsZXItdGltZXMge1x0bW92XHR6WzAtOV0rXC5kLCAjNDFcbn0gMiB9IH0g Ki8KLS8qIHsgZGctZmluYWwgeyBzY2FuLWFzc2VtYmxlci10aW1lcyB7XHRtb3ZcdHpbMC05XStc LmQsICMyNVxufSAyIH0gfSAqLwotLyogeyBkZy1maW5hbCB7IHNjYW4tYXNzZW1ibGVyLXRpbWVz IHtcdG1vdlx0elswLTldK1wuZCwgIzMxXG59IDIgfSB9ICovCi0vKiB7IGRnLWZpbmFsIHsgc2Nh bi1hc3NlbWJsZXItdGltZXMge1x0bW92XHR6WzAtOV0rXC5kLCAjNjJcbn0gMiB9IH0gKi8KLS8q IDMgZm9yIGRvdWJsZS4gICovCi0vKiB7IGRnLWZpbmFsIHsgc2Nhbi1hc3NlbWJsZXItdGltZXMg e1x0bW92XHR6WzAtOV0rXC5kLCB4WzAtOV0rXG59IDMgfSB9ICovCisvKiB7IGRnLWZpbmFsIHsg c2Nhbi1hc3NlbWJsZXItdGltZXMge1x0bGQxcnFkXHR6WzAtOV0rXC5kLCB9IDYgfSB9ICovCiAv KiBUaGUgNjQtYml0IHR5cGVzIG5lZWQ6Ci0KLSAgICAgIFpJUDEgWklQMSAoMiBaSVAycyBvcHRp bWl6ZWQgYXdheSkKICAgICAgIFpJUDEgWklQMi4gICovCi0vKiB7IGRnLWZpbmFsIHsgc2Nhbi1h c3NlbWJsZXItdGltZXMge1x0emlwMVx0elswLTldK1wuZCwgelswLTldK1wuZCwgelswLTldK1wu ZFxufSA5IH0gfSAqLworLyogeyBkZy1maW5hbCB7IHNjYW4tYXNzZW1ibGVyLXRpbWVzIHtcdHpp cDFcdHpbMC05XStcLmQsIHpbMC05XStcLmQsIHpbMC05XStcLmRcbn0gMyB9IH0gKi8KIC8qIHsg ZGctZmluYWwgeyBzY2FuLWFzc2VtYmxlci10aW1lcyB7XHR6aXAyXHR6WzAtOV0rXC5kLCB6WzAt OV0rXC5kLCB6WzAtOV0rXC5kXG59IDMgfSB9ICovCiAKIC8qIFRoZSBsb29wIHNob3VsZCBiZSBm 
dWxseS1tYXNrZWQuICBUaGUgNjQtYml0IHR5cGVzIG5lZWQgdHdvIGxvYWRzCmRpZmYgLS1naXQg YS9nY2MvdGVzdHN1aXRlL2djYy50YXJnZXQvYWFyY2g2NC9zdmUvc2xwXzQuYyBiL2djYy90ZXN0 c3VpdGUvZ2NjLnRhcmdldC9hYXJjaDY0L3N2ZS9zbHBfNC5jCmluZGV4IGIxZmE1ZTNjZjY4Li41 YTlmYzhmZjc1MCAxMDA2NDQKLS0tIGEvZ2NjL3Rlc3RzdWl0ZS9nY2MudGFyZ2V0L2FhcmNoNjQv c3ZlL3NscF80LmMKKysrIGIvZ2NjL3Rlc3RzdWl0ZS9nY2MudGFyZ2V0L2FhcmNoNjQvc3ZlL3Ns cF80LmMKQEAgLTM1LDMxICszNSwyMCBAQCB2ZWNfc2xwXyMjVFlQRSAoVFlQRSAqcmVzdHJpY3Qg YSwgaW50IG4pCQkJXAogCiBURVNUX0FMTCAoVkVDX1BFUk0pCiAKLS8qIDEgZm9yIGVhY2ggOC1i aXQgdHlwZSwgNCBmb3IgZWFjaCAzMi1iaXQgdHlwZSBhbmQgNCBmb3IgZG91YmxlLiAgKi8KLS8q IHsgZGctZmluYWwgeyBzY2FuLWFzc2VtYmxlci10aW1lcyB7XHRsZDFyZFx0elswLTldK1wuZCwg fSAxOCB9IH0gKi8KKy8qIDEgZm9yIGVhY2ggOC1iaXQgdHlwZSAgKi8KKy8qIHsgZGctZmluYWwg eyBzY2FuLWFzc2VtYmxlci10aW1lcyB7XHRsZDFyZFx0elswLTldK1wuZCwgfSAyIH0gfSAqLwog LyogMSBmb3IgZWFjaCAxNi1iaXQgdHlwZS4gICovCiAvKiB7IGRnLWZpbmFsIHsgc2Nhbi1hc3Nl bWJsZXItdGltZXMge1x0bGQxcnFoXHR6WzAtOV0rXC5oLCB9IDMgfSB9ICovCi0vKiB7IGRnLWZp bmFsIHsgc2Nhbi1hc3NlbWJsZXItdGltZXMge1x0bW92XHR6WzAtOV0rXC5kLCAjOTlcbn0gMiB9 IH0gKi8KLS8qIHsgZGctZmluYWwgeyBzY2FuLWFzc2VtYmxlci10aW1lcyB7XHRtb3ZcdHpbMC05 XStcLmQsICMxMVxufSAyIH0gfSAqLwotLyogeyBkZy1maW5hbCB7IHNjYW4tYXNzZW1ibGVyLXRp bWVzIHtcdG1vdlx0elswLTldK1wuZCwgIzE3XG59IDIgfSB9ICovCi0vKiB7IGRnLWZpbmFsIHsg c2Nhbi1hc3NlbWJsZXItdGltZXMge1x0bW92XHR6WzAtOV0rXC5kLCAjODBcbn0gMiB9IH0gKi8K LS8qIHsgZGctZmluYWwgeyBzY2FuLWFzc2VtYmxlci10aW1lcyB7XHRtb3ZcdHpbMC05XStcLmQs ICM2M1xufSAyIH0gfSAqLwotLyogeyBkZy1maW5hbCB7IHNjYW4tYXNzZW1ibGVyLXRpbWVzIHtc dG1vdlx0elswLTldK1wuZCwgIzM3XG59IDIgfSB9ICovCi0vKiB7IGRnLWZpbmFsIHsgc2Nhbi1h c3NlbWJsZXItdGltZXMge1x0bW92XHR6WzAtOV0rXC5kLCAjMjRcbn0gMiB9IH0gKi8KLS8qIHsg ZGctZmluYWwgeyBzY2FuLWFzc2VtYmxlci10aW1lcyB7XHRtb3ZcdHpbMC05XStcLmQsICM4MVxu fSAyIH0gfSAqLwotLyogNCBmb3IgZG91YmxlLiAgKi8KLS8qIHsgZGctZmluYWwgeyBzY2FuLWFz c2VtYmxlci10aW1lcyB7XHRtb3ZcdHpbMC05XStcLmQsIHhbMC05XStcbn0gNCB9IH0gKi8KKy8q 
IHsgZGctZmluYWwgeyBzY2FuLWFzc2VtYmxlci10aW1lcyB7XHRsZDFycWRcdHpbMC05XStcLmQs IH0gMTggfSB9ICovCiAvKiBUaGUgMzItYml0IHR5cGVzIG5lZWQ6CiAKLSAgICAgIFpJUDEgWklQ MSAoMiBaSVAycyBvcHRpbWl6ZWQgYXdheSkKICAgICAgIFpJUDEgWklQMgogCiAgICBhbmQgdGhl IDY0LWJpdCB0eXBlcyBuZWVkOgogCi0gICAgICBaSVAxIFpJUDEgWklQMSBaSVAxICg0IFpJUDJz IG9wdGltaXplZCBhd2F5KQogICAgICAgWklQMSBaSVAyIFpJUDEgWklQMgogICAgICAgWklQMSBa SVAyIFpJUDEgWklQMi4gICovCi0vKiB7IGRnLWZpbmFsIHsgc2Nhbi1hc3NlbWJsZXItdGltZXMg e1x0emlwMVx0elswLTldK1wuZCwgelswLTldK1wuZCwgelswLTldK1wuZFxufSAzMyB9IH0gKi8K Ky8qIHsgZGctZmluYWwgeyBzY2FuLWFzc2VtYmxlci10aW1lcyB7XHR6aXAxXHR6WzAtOV0rXC5k LCB6WzAtOV0rXC5kLCB6WzAtOV0rXC5kXG59IDE1IH0gfSAqLwogLyogeyBkZy1maW5hbCB7IHNj YW4tYXNzZW1ibGVyLXRpbWVzIHtcdHppcDJcdHpbMC05XStcLmQsIHpbMC05XStcLmQsIHpbMC05 XStcLmRcbn0gMTUgfSB9ICovCiAKIC8qIFRoZSBsb29wIHNob3VsZCBiZSBmdWxseS1tYXNrZWQu ICBUaGUgMzItYml0IHR5cGVzIG5lZWQgdHdvIGxvYWRzCg== --000000000000ff5743060b23c0ff--
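
The result-encoding rule discussed in this thread can be illustrated with a small standalone sketch. It is not GCC code: Encoded, result_shape and fold are hypothetical names introduced only for illustration, elements are plain integers, the vector length is a fixed number rather than a poly_uint64, and the selector is assumed to have already been accepted by valid_mask_for_fold_vec_perm_cst_p. The decoding in at() follows the usual (npatterns, nelts_per_pattern) scheme: 1 means a duplicate, 2 means leading elements followed by a duplicate, 3 means a stepped series.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

using std::int64_t;
using std::uint64_t;

// Toy stand-in for a pattern-encoded vector constant or selector.
// Only the first npatterns * nelts_per_pattern elements are stored.
struct Encoded {
  unsigned npatterns;
  unsigned nelts_per_pattern;  // 1 = dup, 2 = leading elts + dup, 3 = stepped
  std::vector<int64_t> elts;

  // Element I of the full (conceptually unbounded) vector.
  int64_t at(uint64_t i) const {
    assert(elts.size() == uint64_t(npatterns) * nelts_per_pattern);
    uint64_t pattern = i % npatterns;
    uint64_t count = i / npatterns;
    if (count < nelts_per_pattern)
      return elts[i];  // explicitly encoded
    // Index of the last encoded element of this pattern.
    uint64_t last = uint64_t(nelts_per_pattern - 1) * npatterns + pattern;
    if (nelts_per_pattern < 3)
      return elts[last];  // the tail repeats the last encoded element
    // Stepped: extrapolate from the last two encoded elements.
    int64_t step = elts[last] - elts[last - npatterns];
    return elts[last] + int64_t(count - 2) * step;
  }
};

struct Shape {
  unsigned npatterns;
  unsigned nelts_per_pattern;
};

// Shape rule from the patch: start from the selector's shape, and drop a
// stepped selector to 2 elements per pattern when neither input is stepped.
Shape result_shape(const Encoded &arg0, const Encoded &arg1,
                   const Encoded &sel) {
  Shape s{sel.npatterns, sel.nelts_per_pattern};
  if (s.nelts_per_pattern == 3 && arg0.nelts_per_pattern < 3 &&
      arg1.nelts_per_pattern < 3)
    s.nelts_per_pattern = 2;
  return s;
}

// Build only the leading elements of the result; LEN is the length of
// each input vector (assumed constant here for simplicity).
Encoded fold(const Encoded &arg0, const Encoded &arg1, const Encoded &sel,
             uint64_t len) {
  Shape s = result_shape(arg0, arg1, sel);
  Encoded res{s.npatterns, s.nelts_per_pattern, {}};
  uint64_t encoded_nelts = uint64_t(s.npatterns) * s.nelts_per_pattern;
  for (uint64_t i = 0; i < encoded_nelts; ++i) {
    uint64_t idx = uint64_t(sel.at(i));
    res.elts.push_back(idx < len ? arg0.at(idx) : arg1.at(idx - len));
  }
  return res;
}
```

For the slp_3.c case (arg0 and arg1 are duplicates, sel = { 0, len, 1, len + 1, 2, len + 2 } with shape (2, 3)), result_shape gives (2, 2) and fold produces { arg0[0], arg1[0], arg0[0], arg1[0] }. The selftest in the patch expects the (2, 1) encoding for this case; the sketch leaves the result at (2, 2) because it does not model any canonicalization of the built vector.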