From mboxrd@z Thu Jan 1 00:00:00 1970
Subject: [PATCH v3] Make loops_list support an optional loop_p root
To: Richard Biener
Cc: GCC Patches, Segher Boessenkool, Martin Sebor, Bill Schmidt
References: <0a8b77ba-1d54-1eff-b54d-d2cb1e769e09@linux.ibm.com>
 <61ac669c-7293-f53a-20c7-158b5a813cee@linux.ibm.com>
 <221d8a67-264a-b6a9-e705-bfb4a45f14bb@linux.ibm.com>
From: "Kewen.Lin"
Message-ID: <963b4fca-8ce6-c9d7-0b08-8431fa433322@linux.ibm.com>
Date: Wed, 4 Aug 2021 10:36:16 +0800
MIME-Version: 1.0
Content-Type: multipart/mixed; boundary="------------2202FFB7F0F83BB6766C60F6"
Content-Language: en-US
List-Id: Gcc-patches mailing list
X-List-Received-Date: Wed, 04 Aug 2021 02:36:32 -0000

This is a multi-part message in MIME format.
--------------2202FFB7F0F83BB6766C60F6
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: 8bit

on 2021/8/3 8:08 PM, Richard Biener wrote:
> On Fri, Jul 30, 2021 at 7:20 AM Kewen.Lin wrote:
>>
>> on 2021/7/29 4:01 PM, Richard Biener wrote:
>>> On Fri, Jul 23, 2021 at 10:41 AM Kewen.Lin wrote:
>>>>
>>>> on 2021/7/22 8:56 PM, Richard Biener wrote:
>>>>> On Tue, Jul 20, 2021 at 4:37 PM Kewen.Lin wrote:
>>>>>>
>>>>>> Hi,
>>>>>>
>>>>>> This v2 has addressed some review comments/suggestions:
>>>>>>
>>>>>>   - Use "!=" instead of "<" in function operator!= (const Iter &rhs)
>>>>>>   - Add new CTOR loops_list (struct loops *loops, unsigned flags)
>>>>>>     to support loop hierarchy tree rather than just a function,
>>>>>>     and adjust to use loops* accordingly.
>>>>>
>>>>> I actually meant struct loop *, not struct loops * ;)  At the point
>>>>> we pondered to make loop invariant motion work on single
>>>>> loop nests we gave up not only but also because it iterates
>>>>> over the loop nest but all the iterators only ever can process
>>>>> all loops, not say, all loops inside a specific 'loop' (and
>>>>> including that 'loop' if LI_INCLUDE_ROOT).  So the
>>>>> CTOR would take the 'root' of the loop tree as argument.
>>>>>
>>>>> I see that doesn't trivially fit how loops_list works, at least
>>>>> not for LI_ONLY_INNERMOST.  But I guess FROM_INNERMOST
>>>>> could be adjusted to do ONLY_INNERMOST as well?
>>>>>
>>>>
>>>>
>>>> Thanks for the clarification!  I just realized that the previous
>>>> version with struct loops* is problematic, all traversal is
>>>> still bounded with outer_loop == NULL.  I think what you expect
>>>> is to respect the given loop_p root boundary.  Since we just
>>>> record the loops' nums, I think we still need the function* fn?
>>>
>>> Would it simplify things if we recorded the actual loop *?
>>>
>>
>> I'm afraid it's unsafe to record the loop*.  I had the same
>> question why the loop iterator uses index rather than loop* when
>> I read this at the first time.  I guess the design of processing
>> loops allows its user to update or even delete the following
>> loops to be visited.  For example, when the user does some tricks
>> on one loop, then it duplicates the loop and its children to
>> somewhere and then removes the loop and its children, when
>> iterating onto its children later, the "index" way will check its
>> validity by get_loop at that point, but the "loop *" way will
>> have some recorded pointers to become dangling, can't do the
>> validity check on itself, seems to need a side linear search to
>> ensure the validity.
>>
>>> There's still the to_visit reserve which needs a bound on
>>> the number of loops for efficiency reasons.
>>>
>>
>> Yes, I still keep the fn in the updated version.
>>
>>>> So I add one optional argument loop_p root and update the
>>>> visiting codes accordingly.  Before this change, the previous
>>>> visiting uses the outer_loop == NULL as the termination condition,
>>>> it perfectly includes the root itself, but with this given root,
>>>> we have to use it as the termination condition to avoid to iterate
>>>> onto its possible existing next.
>>>>
>>>> For LI_ONLY_INNERMOST, I was thinking whether we can use the
>>>> code like:
>>>>
>>>>     struct loops *fn_loops = loops_for_fn (fn)->larray;
>>>>     for (i = 0; vec_safe_iterate (fn_loops, i, &aloop); i++)
>>>>         if (aloop != NULL
>>>>             && aloop->inner == NULL
>>>>             && flow_loop_nested_p (tree_root, aloop))
>>>>              this->to_visit.quick_push (aloop->num);
>>>>
>>>> it has the stable bound, but if the given root only has several
>>>> child loops, it can be much worse if there are many loops in fn.
>>>> It seems impossible to predict the given root loop hierarchy size,
>>>> maybe we can still use the original linear searching for the case
>>>> loops_for_fn (fn) == root?
>>>> But since this visiting seems not so
>>>> performance critical, I chose to share the code originally used
>>>> for FROM_INNERMOST, hope it can have better readability and
>>>> maintainability.
>>>
>>> I was indeed looking for something that has execution/storage
>>> bound on the subtree we're interested in.  If we pull the CTOR
>>> out-of-line we can probably keep the linear search for
>>> LI_ONLY_INNERMOST when looking at the whole loop tree.
>>>
>>
>> OK, I've moved the suggested single loop tree walker out-of-line
>> to cfgloop.c, and brought the linear search back for
>> LI_ONLY_INNERMOST when looking at the whole loop tree.
>>
>>> It just seemed to me that we can eventually re-use a
>>> single loop tree walker for all orders, just adjusting the
>>> places we push.
>>>
>>
>> Wow, good point!  Indeed, I have further unified all orders
>> handlings into a single function walk_loop_tree.
>>
>>>>
>>>> Bootstrapped and regtested on powerpc64le-linux-gnu P9,
>>>> x86_64-redhat-linux and aarch64-linux-gnu, also
>>>> bootstrapped on ppc64le P9 with bootstrap-O3 config.
>>>>
>>>> Does the attached patch meet what you expect?
>>>
>>> So yeah, it's probably close to what is sensible.  Not sure
>>> whether optimizing the loops for the !only_push_innermost_p
>>> case is important - if we manage to produce a single
>>> walker with conditionals based on 'flags' then IPA-CP should
>>> produce optimal clones as well I guess.
>>>
>>
>> Thanks for the comments, the updated v2 is attached.
>> Comparing with v1, it does:
>>
>>   - Unify one single loop tree walker for all orders.
>>   - Move walk_loop_tree out-of-line to cfgloop.c.
>>   - Keep the linear search for LI_ONLY_INNERMOST with
>>     tree_root of fn loops.
>>   - Use class loop * instead of loop_p.
>>
>> Bootstrapped & regtested on powerpc64le-linux-gnu Power9
>> (with/without the hunk for LI_ONLY_INNERMOST linear search,
>> it can have the coverage to exercise LI_ONLY_INNERMOST
>> in walk_loop_tree when "without").
>>
>> Is it ok for trunk?
>
> Looks good to me.  I think that the 'mn' was an optimization
> for the linear walk and it's cheaper to pointer test against
> the actual 'root' loop (no need to dereference).  Thus
>
> +  if (flags & LI_ONLY_INNERMOST && tree_root == loops->tree_root)
>      {
> -      for (i = 0; vec_safe_iterate (loops_for_fn (fn)->larray, i, &aloop); i++)
> +      class loop *aloop;
> +      unsigned int i;
> +      for (i = 0; vec_safe_iterate (loops->larray, i, &aloop); i++)
>         if (aloop != NULL
>             && aloop->inner == NULL
> -           && aloop->num >= mn)
> +           && aloop->num != mn)
>           this->to_visit.quick_push (aloop->num);
>
> could elide the aloop->num != mn check and start iterating from 1,
> since loops->tree_root->num == 0
>
> and the walk_loop_tree could simply do
>
>   class loop *exclude = flags & LI_INCLUDE_ROOT ? NULL : root;
>
> and pointer test aloop against exclude.  That avoids the idea that
> 'mn' is a vehicle to exclude one random loop from the iteration.
>

Good idea!  Thanks for the comments!  The attached v3 has addressed
the review comments on "mn".

Bootstrapped & regtested again on powerpc64le-linux-gnu Power9
(with/without the hunk for LI_ONLY_INNERMOST linear search).

Is it ok for trunk?

BR,
Kewen
-----
gcc/ChangeLog:

	* cfgloop.h (loops_list::loops_list): Add one optional argument root
	and adjust accordingly, update loop tree walking and factor out
	to ...
	* cfgloop.c (loops_list::walk_loop_tree): ...this.  New function.
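
For illustration only, the idea of a single walker for all orders, bounded
by an optional root and using a pointer test against 'exclude', can be
sketched roughly as below.  This is a toy model and not the attached patch:
toy_loop, toy_flags, toy_walk and toy_loops_list are hypothetical names
made up for this sketch, it uses recursion and std::vector instead of the
iterative walk and to_visit.quick_push, and it does not model the linear
search over larray.

```cpp
#include <cassert>
#include <vector>

// Toy stand-in for a loop tree node; every name here is hypothetical.
struct toy_loop
{
  explicit toy_loop (int n) : num (n) {}
  int num;
  toy_loop *inner = nullptr;  // first child loop
  toy_loop *next = nullptr;   // next sibling loop
};

// Toy counterparts of the LI_* flags discussed in the thread.
enum toy_flags : unsigned
{
  TOY_INCLUDE_ROOT = 1,
  TOY_FROM_INNERMOST = 2,
  TOY_ONLY_INNERMOST = 4
};

// Visit L and its children only.  L's own siblings are never followed
// from here, which is what keeps the walk inside the given subtree.
static void
toy_walk (const toy_loop *l, const toy_loop *exclude, unsigned flags,
          std::vector<int> &to_visit)
{
  bool only_innermost_p = (flags & TOY_ONLY_INNERMOST) != 0;
  bool from_innermost_p = (flags & TOY_FROM_INNERMOST) != 0;
  bool preorder_p = !(only_innermost_p || from_innermost_p);
  // Pointer test against EXCLUDE; innermost-only keeps just the leaves.
  bool wanted_p = l != exclude && (!only_innermost_p || l->inner == nullptr);

  if (preorder_p && wanted_p)
    to_visit.push_back (l->num);
  for (const toy_loop *c = l->inner; c != nullptr; c = c->next)
    toy_walk (c, exclude, flags, to_visit);
  if (!preorder_p && wanted_p)
    to_visit.push_back (l->num);
}

// Collect the loop numbers under ROOT in the order selected by FLAGS.
std::vector<int>
toy_loops_list (const toy_loop *root, unsigned flags)
{
  assert (root != nullptr);
  // The two innermost orders are mutually exclusive.
  unsigned checked = TOY_ONLY_INNERMOST | TOY_FROM_INNERMOST;
  assert ((flags & checked) != checked);

  const toy_loop *exclude = (flags & TOY_INCLUDE_ROOT) ? nullptr : root;
  std::vector<int> to_visit;
  toy_walk (root, exclude, flags, to_visit);
  return to_visit;
}
```

With a toy tree where loop 0 contains loops 1 and 2, and loop 1 contains
loops 3 and 4, starting from loop 1 with TOY_INCLUDE_ROOT never reaches
loop 2, since only the children of the root are followed.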
--------------2202FFB7F0F83BB6766C60F6 Content-Type: text/plain; charset=UTF-8; x-mac-type="0"; x-mac-creator="0"; name="loop_root-v3.diff" Content-Transfer-Encoding: base64 Content-Disposition: attachment; filename="loop_root-v3.diff" LS0tCiBnY2MvY2ZnbG9vcC5jIHwgIDY1ICsrKysrKysrKysrKysrKysrKysrKysrKysrKysr KysrCiBnY2MvY2ZnbG9vcC5oIHwgMTAwICsrKysrKysrKysrKysrKysrKysrLS0tLS0tLS0t LS0tLS0tLS0tLS0tLS0tLS0tLS0tCiAyIGZpbGVzIGNoYW5nZWQsIDEwNSBpbnNlcnRpb25z KCspLCA2MCBkZWxldGlvbnMoLSkKCmRpZmYgLS1naXQgYS9nY2MvY2ZnbG9vcC5jIGIvZ2Nj L2NmZ2xvb3AuYwppbmRleCA2Mjg0YWUyOTJiNi4uYWZiYWEyMTZjZTUgMTAwNjQ0Ci0tLSBh L2djYy9jZmdsb29wLmMKKysrIGIvZ2NjL2NmZ2xvb3AuYwpAQCAtMjEwNCwzICsyMTA0LDY4 IEBAIG1hcmtfbG9vcF9mb3JfcmVtb3ZhbCAobG9vcF9wIGxvb3ApCiAgIGxvb3AtPmxhdGNo ID0gTlVMTDsKICAgbG9vcHNfc3RhdGVfc2V0IChMT09QU19ORUVEX0ZJWFVQKTsKIH0KKwor LyogU3RhcnRpbmcgZnJvbSBsb29wIHRyZWUgUk9PVCwgd2FsayBsb29wIHRyZWUgYXMgdGhl IHZpc2l0aW5nCisgICBvcmRlciBzcGVjaWZpZWQgYnkgRkxBR1MuICBUaGUgc3VwcG9ydGVk IHZpc2l0aW5nIG9yZGVycworICAgYXJlOgorICAgICAtIExJX09OTFlfSU5ORVJNT1NUCisg ICAgIC0gTElfRlJPTV9JTk5FUk1PU1QKKyAgICAgLSBQcmVvcmRlciAoaWYgbmVpdGhlciBv ZiBhYm92ZSBpcyBzcGVjaWZpZWQpICAqLworCit2b2lkCitsb29wc19saXN0Ojp3YWxrX2xv b3BfdHJlZSAoY2xhc3MgbG9vcCAqcm9vdCwgdW5zaWduZWQgZmxhZ3MpCit7CisgIGJvb2wg b25seV9pbm5lcm1vc3RfcCA9IGZsYWdzICYgTElfT05MWV9JTk5FUk1PU1Q7CisgIGJvb2wg ZnJvbV9pbm5lcm1vc3RfcCA9IGZsYWdzICYgTElfRlJPTV9JTk5FUk1PU1Q7CisgIGJvb2wg cHJlb3JkZXJfcCA9ICEob25seV9pbm5lcm1vc3RfcCB8fCBmcm9tX2lubmVybW9zdF9wKTsK KyAgY2xhc3MgbG9vcCAqZXhjbHVkZSA9IGZsYWdzICYgTElfSU5DTFVERV9ST09UID8gTlVM TCA6IHJvb3Q7CisKKyAgLyogRWFybHkgaGFuZGxlIHJvb3Qgd2l0aG91dCBhbnkgaW5uZXIg bG9vcHMsIG1ha2UgbGF0ZXIKKyAgICAgcHJvY2Vzc2luZyBzaW1wbGVyLCB0aGF0IGlzIGFs bCBsb29wcyBwcm9jZXNzZWQgaW4gdGhlCisgICAgIGZvbGxvd2luZyB3aGlsZSBsb29wIGFy ZSBpbXBvc3NpYmxlIHRvIGJlIHJvb3QuICAqLworICBpZiAoIXJvb3QtPmlubmVyKQorICAg IHsKKyAgICAgIGlmIChyb290ICE9IGV4Y2x1ZGUpCisJdGhpcy0+dG9fdmlzaXQucXVpY2tf cHVzaCAocm9vdC0+bnVtKTsKKyAgICAgIHJldHVybjsKKyAgICB9CisKKyAgY2xhc3MgbG9v 
cCAqYWxvb3A7CisgIGZvciAoYWxvb3AgPSByb290OworICAgICAgIGFsb29wLT5pbm5lciAh PSBOVUxMOworICAgICAgIGFsb29wID0gYWxvb3AtPmlubmVyKQorICAgIHsKKyAgICAgIGlm IChwcmVvcmRlcl9wICYmIGFsb29wICE9IGV4Y2x1ZGUpCisJdGhpcy0+dG9fdmlzaXQucXVp Y2tfcHVzaCAoYWxvb3AtPm51bSk7CisgICAgICBjb250aW51ZTsKKyAgICB9CisKKyAgd2hp bGUgKDEpCisgICAgeworICAgICAgZ2NjX2Fzc2VydCAoYWxvb3AgIT0gcm9vdCk7CisgICAg ICBpZiAoZnJvbV9pbm5lcm1vc3RfcCB8fCBhbG9vcC0+aW5uZXIgPT0gTlVMTCkKKwl0aGlz LT50b192aXNpdC5xdWlja19wdXNoIChhbG9vcC0+bnVtKTsKKworICAgICAgaWYgKGFsb29w LT5uZXh0KQorCXsKKwkgIGZvciAoYWxvb3AgPSBhbG9vcC0+bmV4dDsKKwkgICAgICAgYWxv b3AtPmlubmVyICE9IE5VTEw7CisJICAgICAgIGFsb29wID0gYWxvb3AtPmlubmVyKQorCSAg ICB7CisJICAgICAgaWYgKHByZW9yZGVyX3ApCisJCXRoaXMtPnRvX3Zpc2l0LnF1aWNrX3B1 c2ggKGFsb29wLT5udW0pOworCSAgICAgIGNvbnRpbnVlOworCSAgICB9CisJfQorICAgICAg ZWxzZSBpZiAobG9vcF9vdXRlciAoYWxvb3ApID09IHJvb3QpCisJYnJlYWs7CisgICAgICBl bHNlCisJYWxvb3AgPSBsb29wX291dGVyIChhbG9vcCk7CisgICAgfQorCisgIC8qIFdoZW4g dmlzaXRpbmcgZnJvbSBpbm5lcm1vc3QsIHdlIG5lZWQgdG8gY29uc2lkZXIgcm9vdCBoZXJl CisgICAgIHNpbmNlIHRoZSBwcmV2aW91cyB3aGlsZSBsb29wIGRvZXNuJ3QgaGFuZGxlIGl0 LiAgKi8KKyAgaWYgKGZyb21faW5uZXJtb3N0X3AgJiYgcm9vdCAhPSBleGNsdWRlKQorICAg IHRoaXMtPnRvX3Zpc2l0LnF1aWNrX3B1c2ggKHJvb3QtPm51bSk7Cit9CisKZGlmZiAtLWdp dCBhL2djYy9jZmdsb29wLmggYi9nY2MvY2ZnbG9vcC5oCmluZGV4IGQ1ZWVlNmI0ODQwLi5m ZWQyYjBkYWY0YiAxMDA2NDQKLS0tIGEvZ2NjL2NmZ2xvb3AuaAorKysgYi9nY2MvY2ZnbG9v cC5oCkBAIC02NjksMTMgKzY2OSwxNSBAQCBhc19jb25zdCAoVCAmdCkKIH0KIAogLyogQSBs aXN0IGZvciB2aXNpdGluZyBsb29wcywgd2hpY2ggY29udGFpbnMgdGhlIGxvb3AgbnVtYmVy cyBpbnN0ZWFkIG9mCi0gICB0aGUgbG9vcCBwb2ludGVycy4gIFRoZSBzY29wZSBpcyByZXN0 cmljdGVkIGluIGZ1bmN0aW9uIEZOIGFuZCB0aGUKLSAgIHZpc2l0aW5nIG9yZGVyIGlzIHNw ZWNpZmllZCBieSBGTEFHUy4gICovCisgICB0aGUgbG9vcCBwb2ludGVycy4gIElmIHRoZSBs b29wIFJPT1QgaXMgb2ZmZXJlZCAobm9uLW51bGwpLCB0aGUgdmlzaXRpbmcKKyAgIHdpbGwg c3RhcnQgZnJvbSBpdCwgb3RoZXJ3aXNlIGl0IHdvdWxkIHN0YXJ0IGZyb20gdGhlIHRyZWVf cm9vdCBvZgorICAgbG9vcHNfZm9yX2ZuIChGTikgaW5zdGVhZC4gIFRoZSBzY29wZSBpcyBy 
ZXN0cmljdGVkIGluIGZ1bmN0aW9uIEZOIGFuZAorICAgdGhlIHZpc2l0aW5nIG9yZGVyIGlz IHNwZWNpZmllZCBieSBGTEFHUy4gICovCiAKIGNsYXNzIGxvb3BzX2xpc3QKIHsKIHB1Ymxp YzoKLSAgbG9vcHNfbGlzdCAoZnVuY3Rpb24gKmZuLCB1bnNpZ25lZCBmbGFncyk7CisgIGxv b3BzX2xpc3QgKGZ1bmN0aW9uICpmbiwgdW5zaWduZWQgZmxhZ3MsIGNsYXNzIGxvb3AgKnJv b3QgPSBudWxscHRyKTsKIAogICB0ZW1wbGF0ZSA8dHlwZW5hbWUgVD4gY2xhc3MgSXRlcgog ICB7CkBAIC03NTAsNiArNzUyLDEwIEBAIHB1YmxpYzoKICAgfQogCiBwcml2YXRlOgorICAv KiBXYWxrIGxvb3AgdHJlZSBzdGFydGluZyBmcm9tIFJPT1QgYXMgdGhlIHZpc2l0aW5nIG9y ZGVyIHNwZWNpZmllZAorICAgICBieSBGTEFHUy4gICovCisgIHZvaWQgd2Fsa19sb29wX3Ry ZWUgKGNsYXNzIGxvb3AgKnJvb3QsIHVuc2lnbmVkIGZsYWdzKTsKKwogICAvKiBUaGUgZnVu Y3Rpb24gd2UgYXJlIHZpc2l0aW5nLiAgKi8KICAgZnVuY3Rpb24gKmZuOwogCkBAIC03ODIs NzYgKzc4OCw1MCBAQCBsb29wc19saXN0OjpJdGVyPFQ+OjpmaWxsX2N1cnJfbG9vcCAoKQog fQogCiAvKiBTZXQgdXAgdGhlIGxvb3BzIGxpc3QgdG8gdmlzaXQgYWNjb3JkaW5nIHRvIHRo ZSBzcGVjaWZpZWQKLSAgIGZ1bmN0aW9uIHNjb3BlIEZOIGFuZCBpdGVyYXRpbmcgb3JkZXIg RkxBR1MuICAqLworICAgZnVuY3Rpb24gc2NvcGUgRk4gYW5kIGl0ZXJhdGluZyBvcmRlciBG TEFHUy4gIElmIFJPT1QgaXMKKyAgIG5vdCBudWxsLCB0aGUgdmlzaXRpbmcgd291bGQgc3Rh cnQgZnJvbSBpdCwgb3RoZXJ3aXNlIGl0CisgICB3aWxsIHN0YXJ0IGZyb20gdHJlZV9yb290 IG9mIGxvb3BzX2Zvcl9mbiAoRk4pLiAgKi8KIAotaW5saW5lIGxvb3BzX2xpc3Q6Omxvb3Bz X2xpc3QgKGZ1bmN0aW9uICpmbiwgdW5zaWduZWQgZmxhZ3MpCitpbmxpbmUgbG9vcHNfbGlz dDo6bG9vcHNfbGlzdCAoZnVuY3Rpb24gKmZuLCB1bnNpZ25lZCBmbGFncywgY2xhc3MgbG9v cCAqcm9vdCkKIHsKLSAgY2xhc3MgbG9vcCAqYWxvb3A7Ci0gIHVuc2lnbmVkIGk7Ci0gIGlu dCBtbjsKKyAgc3RydWN0IGxvb3BzICpsb29wcyA9IGxvb3BzX2Zvcl9mbiAoZm4pOworICBn Y2NfYXNzZXJ0ICghcm9vdCB8fCBsb29wcyk7CisKKyAgLyogQ2hlY2sgbXV0dWFsbHkgZXhj bHVzaXZlIGZsYWdzIHNob3VsZCBub3QgY28tZXhpc3QuICAqLworICB1bnNpZ25lZCBjaGVj a2VkX2ZsYWdzID0gTElfT05MWV9JTk5FUk1PU1QgfCBMSV9GUk9NX0lOTkVSTU9TVDsKKyAg Z2NjX2Fzc2VydCAoKGZsYWdzICYgY2hlY2tlZF9mbGFncykgIT0gY2hlY2tlZF9mbGFncyk7 CiAKICAgdGhpcy0+Zm4gPSBmbjsKLSAgaWYgKCFsb29wc19mb3JfZm4gKGZuKSkKKyAgaWYg KCFsb29wcykKICAgICByZXR1cm47CiAKKyAgY2xhc3MgbG9vcCAqdHJlZV9yb290ID0gcm9v 
dCA/IHJvb3QgOiBsb29wcy0+dHJlZV9yb290OworCiAgIHRoaXMtPnRvX3Zpc2l0LnJlc2Vy dmVfZXhhY3QgKG51bWJlcl9vZl9sb29wcyAoZm4pKTsKLSAgbW4gPSAoZmxhZ3MgJiBMSV9J TkNMVURFX1JPT1QpID8gMCA6IDE7CiAKLSAgaWYgKGZsYWdzICYgTElfT05MWV9JTk5FUk1P U1QpCisgIC8qIFdoZW4gcm9vdCBpcyB0cmVlX3Jvb3Qgb2YgbG9vcHNfZm9yX2ZuIChmbikg YW5kIHRoZSB2aXNpdGluZworICAgICBvcmRlciBpcyBMSV9PTkxZX0lOTkVSTU9TVCwgd2Ug d291bGQgbGlrZSB0byB1c2UgbGluZWFyCisgICAgIHNlYXJjaCBoZXJlIHNpbmNlIGl0IGhh cyBhIG1vcmUgc3RhYmxlIGJvdW5kIHRoYW4gdGhlCisgICAgIHdhbGtfbG9vcF90cmVlLiAg Ki8KKyAgaWYgKGZsYWdzICYgTElfT05MWV9JTk5FUk1PU1QgJiYgdHJlZV9yb290ID09IGxv b3BzLT50cmVlX3Jvb3QpCiAgICAgewotICAgICAgZm9yIChpID0gMDsgdmVjX3NhZmVfaXRl cmF0ZSAobG9vcHNfZm9yX2ZuIChmbiktPmxhcnJheSwgaSwgJmFsb29wKTsgaSsrKQotCWlm IChhbG9vcCAhPSBOVUxMCi0JICAgICYmIGFsb29wLT5pbm5lciA9PSBOVUxMCi0JICAgICYm IGFsb29wLT5udW0gPj0gbW4pCi0JICB0aGlzLT50b192aXNpdC5xdWlja19wdXNoIChhbG9v cC0+bnVtKTsKLSAgICB9Ci0gIGVsc2UgaWYgKGZsYWdzICYgTElfRlJPTV9JTk5FUk1PU1Qp Ci0gICAgewotICAgICAgLyogUHVzaCB0aGUgbG9vcHMgdG8gTEktPlRPX1ZJU0lUIGluIHBv c3RvcmRlci4gICovCi0gICAgICBmb3IgKGFsb29wID0gbG9vcHNfZm9yX2ZuIChmbiktPnRy ZWVfcm9vdDsKLQkgICBhbG9vcC0+aW5uZXIgIT0gTlVMTDsKLQkgICBhbG9vcCA9IGFsb29w LT5pbm5lcikKLQljb250aW51ZTsKLQotICAgICAgd2hpbGUgKDEpCisgICAgICBnY2NfYXNz ZXJ0ICh0cmVlX3Jvb3QtPm51bSA9PSAwKTsKKyAgICAgIGlmICh0cmVlX3Jvb3QtPmlubmVy ID09IE5VTEwpCiAJewotCSAgaWYgKGFsb29wLT5udW0gPj0gbW4pCi0JICAgIHRoaXMtPnRv X3Zpc2l0LnF1aWNrX3B1c2ggKGFsb29wLT5udW0pOwotCi0JICBpZiAoYWxvb3AtPm5leHQp Ci0JICAgIHsKLQkgICAgICBmb3IgKGFsb29wID0gYWxvb3AtPm5leHQ7Ci0JCSAgIGFsb29w LT5pbm5lciAhPSBOVUxMOwotCQkgICBhbG9vcCA9IGFsb29wLT5pbm5lcikKLQkJY29udGlu dWU7Ci0JICAgIH0KLQkgIGVsc2UgaWYgKCFsb29wX291dGVyIChhbG9vcCkpCi0JICAgIGJy ZWFrOwotCSAgZWxzZQotCSAgICBhbG9vcCA9IGxvb3Bfb3V0ZXIgKGFsb29wKTsKKwkgIGlm IChmbGFncyAmIExJX0lOQ0xVREVfUk9PVCkKKwkgICAgdGhpcy0+dG9fdmlzaXQucXVpY2tf cHVzaCAoMCk7CisKKwkgIHJldHVybjsKIAl9CisKKyAgICAgIGNsYXNzIGxvb3AgKmFsb29w OworICAgICAgdW5zaWduZWQgaW50IGk7CisgICAgICBmb3IgKGkgPSAxOyB2ZWNfc2FmZV9p 
dGVyYXRlIChsb29wcy0+bGFycmF5LCBpLCAmYWxvb3ApOyBpKyspCisJaWYgKGFsb29wICE9 IE5VTEwgJiYgYWxvb3AtPmlubmVyID09IE5VTEwpCisJICB0aGlzLT50b192aXNpdC5xdWlj a19wdXNoIChhbG9vcC0+bnVtKTsKICAgICB9CiAgIGVsc2UKLSAgICB7Ci0gICAgICAvKiBQ dXNoIHRoZSBsb29wcyB0byBMSS0+VE9fVklTSVQgaW4gcHJlb3JkZXIuICAqLwotICAgICAg YWxvb3AgPSBsb29wc19mb3JfZm4gKGZuKS0+dHJlZV9yb290OwotICAgICAgd2hpbGUgKDEp Ci0JewotCSAgaWYgKGFsb29wLT5udW0gPj0gbW4pCi0JICAgIHRoaXMtPnRvX3Zpc2l0LnF1 aWNrX3B1c2ggKGFsb29wLT5udW0pOwotCi0JICBpZiAoYWxvb3AtPmlubmVyICE9IE5VTEwp Ci0JICAgIGFsb29wID0gYWxvb3AtPmlubmVyOwotCSAgZWxzZQotCSAgICB7Ci0JICAgICAg d2hpbGUgKGFsb29wICE9IE5VTEwgJiYgYWxvb3AtPm5leHQgPT0gTlVMTCkKLQkJYWxvb3Ag PSBsb29wX291dGVyIChhbG9vcCk7Ci0JICAgICAgaWYgKGFsb29wID09IE5VTEwpCi0JCWJy ZWFrOwotCSAgICAgIGFsb29wID0gYWxvb3AtPm5leHQ7Ci0JICAgIH0KLQl9Ci0gICAgfQor ICAgIHdhbGtfbG9vcF90cmVlICh0cmVlX3Jvb3QsIGZsYWdzKTsKIH0KIAogLyogVGhlIHBy b3BlcnRpZXMgb2YgdGhlIHRhcmdldC4gICovCi0tIAoyLjI3LjAKCg== --------------2202FFB7F0F83BB6766C60F6--