From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mail-pf1-x42d.google.com (mail-pf1-x42d.google.com [IPv6:2607:f8b0:4864:20::42d]) by sourceware.org (Postfix) with ESMTPS id C3FA73858407 for ; Wed, 21 Sep 2022 07:45:44 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.1 sourceware.org C3FA73858407 Authentication-Results: sourceware.org; dmarc=pass (p=none dis=none) header.from=gmail.com Authentication-Results: sourceware.org; spf=pass smtp.mailfrom=gmail.com Received: by mail-pf1-x42d.google.com with SMTP id a29so5110916pfk.5 for ; Wed, 21 Sep 2022 00:45:44 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20210112; h=to:subject:from:content-language:user-agent:mime-version:date :message-id:from:to:cc:subject:date; bh=LMeaakuCMH5i6w1zJ8loJn5d4jlIUlfhXuB+KO492hM=; b=Ejf9C4zRdhWeqzNGiXjkqQip5bvmyfhH8MksoJNJxwqrj7gO4PxTMyPaVOXVi+G3yZ xoSnrbzjRNDObXVwyGxTNpbl3a6Nr0Vn85RnYPkJHUPuDbGhfqHU6YWapbnZLCVstKyq sMEwoE11cIEuX8xIHuYNrmIPegzJplrAZx47toPZBXPDQMJHSZhBcID2/RxFyikR7O3X zpQOxDulnMB0TF+AYfcsbpr6WgJYPM/80zDfkvmgD79fkVDSY1q5Og3Y6e52o7pYdgvM j5rn45jEgugG+BW/iVO/0OCaC+oslanziNwF0HfZhqPHDcQIYy5m44aHzCPA+SVvk4RD 1aCw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=to:subject:from:content-language:user-agent:mime-version:date :message-id:x-gm-message-state:from:to:cc:subject:date; bh=LMeaakuCMH5i6w1zJ8loJn5d4jlIUlfhXuB+KO492hM=; b=QSRadetn8HqnOhv2qfqY6Irc1TR4a6I4eCVDz5GvvoVt31+V2vNtwiqDSHwxWcqgly FzgyFbCGarlemHFfeEKJEKJDUTk3R9RvNVzF0BZ5RR0c2zynUy3T4EB96/0UOHvR0MtT O9gS86rSiPfedWKJ5CKJ5vXKAG03+6fhqo3OqAD32VdeWrw52s0UB2Py7jVbxLorghuG 5+Z50UwCnkQ+xi2uWVDm31Pul9o091o4Yd2Wh/Tw3vsRUSFkeuqYweT3DnXY9ftDTJpn RfUyFdafaFlkjZwZ/hpIxc3+tesdG5h7roUNmdW0WvO6B+AYZ0XYiCS/coIKe0ASrQQH 6dVA== X-Gm-Message-State: ACrzQf065myeDhU5Hy8DUceeBwmYr7M2CqM4E6s+a6ZBgAGOwqi5tEFs 73xAVYCbQM+WUxbGPuK1Hn+xx00LkdsZ5w== X-Google-Smtp-Source: AMsMyM66Fca5IbpjGrplYAnrPfZ7XSh5XtmHGO4sjI6FK9zPIbSjqHMo1RTSnveFMXrQeoDBFESrYA== 
X-Received: by 2002:a63:6e8e:0:b0:430:3886:3a20 with SMTP id j136-20020a636e8e000000b0043038863a20mr23633675pgc.604.1663746343037;
	Wed, 21 Sep 2022 00:45:43 -0700 (PDT)
Received: from [192.168.50.11] (112-104-15-252.adsl.dynamic.seed.net.tw. [112.104.15.252])
	by smtp.gmail.com with ESMTPSA id na18-20020a17090b4c1200b001fde265ff4bsm1245343pjb.4.2022.09.21.00.45.38
	(version=TLS1_3 cipher=TLS_AES_128_GCM_SHA256 bits=128/128);
	Wed, 21 Sep 2022 00:45:39 -0700 (PDT)
Content-Type: multipart/mixed; boundary="------------UeLwDx0m0zxUTc8qrOCM4f60"
Message-ID: <8b974d21-e288-4596-7500-277a43c92771@gmail.com>
Date: Wed, 21 Sep 2022 15:45:36 +0800
MIME-Version: 1.0
User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10.13; rv:91.0) Gecko/20100101 Thunderbird/91.13.0
Content-Language: en-US
From: Chung-Lin Tang
Subject: [PATCH, nvptx, 1/2] Reimplement libgomp barriers for nvptx
To: gcc-patches , Tom de Vries , Catherine Moore
X-Spam-Status: No, score=-10.3 required=5.0
	tests=BAYES_00,DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF,FREEMAIL_FROM,GIT_PATCH_0,KAM_SHORT,RCVD_IN_BARRACUDACENTRAL,RCVD_IN_DNSWL_NONE,SPF_HELO_NONE,SPF_PASS,TXREP
	autolearn=ham autolearn_force=no version=3.4.6
X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org
List-Id:

This is a multi-part message in MIME format.
--------------UeLwDx0m0zxUTc8qrOCM4f60
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 7bit

Hi Tom,

I submitted a patch earlier, in which I reported that the current implementation of barriers in libgomp on nvptx causes a quite significant performance drop on some SPEChpc 2021 benchmarks:
https://gcc.gnu.org/pipermail/gcc-patches/2022-September/600818.html

That previous patch wasn't well received (admittedly, it was kind of a hack). So in this patch, I tried to (mostly) re-implement team-barriers for NVPTX.

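To illustrate the idea, here is a minimal host-side C model of the control flow of the re-implemented team barrier. It is only a sketch and not the patch code: model_bar_red_or and model_team_barrier are hypothetical names, and the OR-reduction that a 'bar.red' instruction performs across the threads arriving at the barrier is modeled with a plain loop, under the assumption that each thread contributes the predicate (team->task_count != 0).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical host-side model, not libgomp code.  Each array element
   stands for the value one GPU thread contributes at the barrier.  */

#define MODEL_MAX_THREADS 64

/* Model of an OR-reduction barrier: every participant receives the
   logical OR of all contributed predicates.  */
static bool
model_bar_red_or (const bool *pred, size_t nthreads)
{
  assert (nthreads > 0);
  bool any = false;
  for (size_t i = 0; i < nthreads; i++)
    any = any || pred[i];
  return any;
}

/* Model of one team barrier.  task_count_seen[i] is the task count
   that thread i observed on arrival.  Returns the number of threads
   that enter the task-processing path after the barrier.  */
static size_t
model_team_barrier (const unsigned *task_count_seen, size_t nthreads)
{
  bool pred[MODEL_MAX_THREADS];
  assert (nthreads > 0 && nthreads <= MODEL_MAX_THREADS);

  for (size_t i = 0; i < nthreads; i++)
    pred[i] = (task_count_seen[i] != 0);

  /* All threads see the same reduced value, so either all of them
     process tasks or none of them do.  */
  bool run_tasks = model_bar_red_or (pred, nthreads);
  return run_tasks ? nthreads : 0;
}
```

In this model, four threads of which only one observed a non-zero task count give model_team_barrier a result of 4 (every thread enters the task path), while four threads that all observed zero give 0, so the barrier completes without any task processing.
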
Basically, instead of trying to have the GPU do CPU-with-OS-like things that it isn't suited for, barriers are now implemented simply with bar.* synchronization instructions. Tasks are processed after threads have joined, and only if team->task_count != 0.

(Arguably, a little performance is forfeited here, since earlier-arriving threads could have been used to process tasks ahead of the other threads. But that again requires implementing complex futex-wait/wake-like behavior. Really, that kind of tasking is not what target offloading is usually used for.)

Implementation highlights:

1. gomp_team_barrier_wake() is now an empty function (threads never "wake" in the usual manner).

2. gomp_team_barrier_cancel() now uses the "exit" PTX instruction.

3. gomp_barrier_wait_last() is now implemented using "bar.arrive".

4. gomp_team_barrier_wait_end()/gomp_team_barrier_wait_cancel_end(): The main synchronization is done using a 'bar.red' instruction. This reduces the condition (team->task_count != 0) across all threads, to enable the task processing further down if any thread created a task. (This bar.red usage requires the second GCC patch in this series.)

This patch has been tested on x86_64/powerpc64le with nvptx offloading, using the libgomp, ovo, omptests, and sollve_vv testsuites, all without regressions. I also verified that the SPEChpc 2021 521.miniswp_t and 534.hpgmgfv_t performance regressions that occurred in the GCC12 cycle are resolved, with performance restored to devel/omp/gcc-11 (OG11) branch levels.

Is this okay for trunk? (I also suggest backporting to the GCC12 branch, if a performance regression can be considered a defect.)

Thanks,
Chung-Lin

libgomp/ChangeLog:

2022-09-21  Chung-Lin Tang

	* config/nvptx/bar.c (generation_to_barrier): Remove.
	(futex_wait,futex_wake,do_spin,do_wait): Remove.
	(GOMP_WAIT_H): Remove.
	(#include "../linux/bar.c"): Remove.
	(gomp_barrier_wait_end): New function.
	(gomp_barrier_wait): Likewise.
	(gomp_barrier_wait_last): Likewise.
(gomp_team_barrier_wait_end): Likewise. (gomp_team_barrier_wait): Likewise. (gomp_team_barrier_wait_final): Likewise. (gomp_team_barrier_wait_cancel_end): Likewise. (gomp_team_barrier_wait_cancel): Likewise. (gomp_team_barrier_cancel): Likewise. * config/nvptx/bar.h (gomp_team_barrier_wake): Remove prototype, add new static inline function. --------------UeLwDx0m0zxUTc8qrOCM4f60 Content-Type: text/plain; charset=UTF-8; name="nvptx-libgomp-barrier.patch" Content-Disposition: attachment; filename="nvptx-libgomp-barrier.patch" Content-Transfer-Encoding: base64 ZGlmZiAtLWdpdCBhL2xpYmdvbXAvY29uZmlnL252cHR4L2Jhci5jIGIvbGliZ29tcC9jb25m aWcvbnZwdHgvYmFyLmMKaW5kZXggZWVlMjEwNy4uMGI5NThlZCAxMDA2NDQKLS0tIGEvbGli Z29tcC9jb25maWcvbnZwdHgvYmFyLmMKKysrIGIvbGliZ29tcC9jb25maWcvbnZwdHgvYmFy LmMKQEAgLTMwLDEzNyArMzAsMTQzIEBACiAjaW5jbHVkZSA8bGltaXRzLmg+CiAjaW5jbHVk ZSAibGliZ29tcC5oIgogCi0vKiBGb3IgY3B1X3JlbGF4LiAgKi8KLSNpbmNsdWRlICJkb2Fj cm9zcy5oIgotCi0vKiBBc3N1bWluZyBBRERSIGlzICZiYXItPmdlbmVyYXRpb24sIHJldHVy biBiYXIuICBDb3BpZWQgZnJvbQotICAgcnRlbXMvYmFyLmMuICAqLwordm9pZAorZ29tcF9i YXJyaWVyX3dhaXRfZW5kIChnb21wX2JhcnJpZXJfdCAqYmFyLCBnb21wX2JhcnJpZXJfc3Rh dGVfdCBzdGF0ZSkKK3sKKyAgaWYgKF9fYnVpbHRpbl9leHBlY3QgKHN0YXRlICYgQkFSX1dB U19MQVNULCAwKSkKKyAgICB7CisgICAgICAvKiBOZXh0IHRpbWUgd2UnbGwgYmUgYXdhaXRp bmcgVE9UQUwgdGhyZWFkcyBhZ2Fpbi4gICovCisgICAgICBiYXItPmF3YWl0ZWQgPSBiYXIt PnRvdGFsOworICAgICAgX19hdG9taWNfc3RvcmVfbiAoJmJhci0+Z2VuZXJhdGlvbiwgYmFy LT5nZW5lcmF0aW9uICsgQkFSX0lOQ1IsCisJCQlNRU1NT0RFTF9SRUxFQVNFKTsKKyAgICB9 CisgIGlmIChiYXItPnRvdGFsID4gMSkKKyAgICBhc20gKCJiYXIuc3luYyAxLCAlMDsiIDog OiAiciIgKDMyICogYmFyLT50b3RhbCkpOworfQogCi1zdGF0aWMgZ29tcF9iYXJyaWVyX3Qg KgotZ2VuZXJhdGlvbl90b19iYXJyaWVyIChpbnQgKmFkZHIpCit2b2lkCitnb21wX2JhcnJp ZXJfd2FpdCAoZ29tcF9iYXJyaWVyX3QgKmJhcikKIHsKLSAgY2hhciAqYmFyCi0gICAgPSAo Y2hhciAqKSBhZGRyIC0gX19idWlsdGluX29mZnNldG9mIChnb21wX2JhcnJpZXJfdCwgZ2Vu ZXJhdGlvbik7Ci0gIHJldHVybiAoZ29tcF9iYXJyaWVyX3QgKiliYXI7CisgIGdvbXBfYmFy 
cmllcl93YWl0X2VuZCAoYmFyLCBnb21wX2JhcnJpZXJfd2FpdF9zdGFydCAoYmFyKSk7CiB9 CiAKLS8qIEltcGxlbWVudCBmdXRleF93YWl0LWxpa2UgYmVoYXZpb3VyIHRvIHBsdWcgaW50 byB0aGUgbGludXgvYmFyLmMKLSAgIGltcGxlbWVudGF0aW9uLiAgQXNzdW1lcyBBRERSIGlz ICZiYXItPmdlbmVyYXRpb24uICAgKi8KKy8qIExpa2UgZ29tcF9iYXJyaWVyX3dhaXQsIGV4 Y2VwdCB0aGF0IGlmIHRoZSBlbmNvdW50ZXJpbmcgdGhyZWFkCisgICBpcyBub3QgdGhlIGxh c3Qgb25lIHRvIGhpdCB0aGUgYmFycmllciwgaXQgcmV0dXJucyBpbW1lZGlhdGVseS4KKyAg IFRoZSBpbnRlbmRlZCB1c2FnZSBpcyB0aGF0IGEgdGhyZWFkIHdoaWNoIGludGVuZHMgdG8g Z29tcF9iYXJyaWVyX2Rlc3Ryb3kKKyAgIHRoaXMgYmFycmllciBjYWxscyBnb21wX2JhcnJp ZXJfd2FpdCwgd2hpbGUgYWxsIG90aGVyIHRocmVhZHMKKyAgIGNhbGwgZ29tcF9iYXJyaWVy X3dhaXRfbGFzdC4gIFdoZW4gZ29tcF9iYXJyaWVyX3dhaXQgcmV0dXJucywKKyAgIHRoZSBi YXJyaWVyIGNhbiBiZSBzYWZlbHkgZGVzdHJveWVkLiAgKi8KIAotc3RhdGljIGlubGluZSB2 b2lkCi1mdXRleF93YWl0IChpbnQgKmFkZHIsIGludCB2YWwpCit2b2lkCitnb21wX2JhcnJp ZXJfd2FpdF9sYXN0IChnb21wX2JhcnJpZXJfdCAqYmFyKQogewotICBnb21wX2JhcnJpZXJf dCAqYmFyID0gZ2VuZXJhdGlvbl90b19iYXJyaWVyIChhZGRyKTsKKyAgLyogVGhlIGFib3Zl IGRlc2NyaWJlZCBiZWhhdmlvciBtYXRjaGVzICdiYXIuYXJyaXZlJyBwZXJmZWN0bHkuICAq LworICBpZiAoYmFyLT50b3RhbCA+IDEpCisgICAgYXNtICgiYmFyLmFycml2ZSAxLCAlMDsi IDogOiAiciIgKDMyICogYmFyLT50b3RhbCkpOworfQogCi0gIGlmIChiYXItPnRvdGFsIDwg MikKLSAgICAvKiBBIGJhcnJpZXIgd2l0aCBsZXNzIHRoYW4gdHdvIHRocmVhZHMsIG5vcC4g ICovCi0gICAgcmV0dXJuOwordm9pZAorZ29tcF90ZWFtX2JhcnJpZXJfd2FpdF9lbmQgKGdv bXBfYmFycmllcl90ICpiYXIsIGdvbXBfYmFycmllcl9zdGF0ZV90IHN0YXRlKQoreworICBz dHJ1Y3QgZ29tcF90aHJlYWQgKnRociA9IGdvbXBfdGhyZWFkICgpOworICBzdHJ1Y3QgZ29t cF90ZWFtICp0ZWFtID0gdGhyLT50cy50ZWFtOwogCi0gIGdvbXBfbXV0ZXhfbG9jayAoJmJh ci0+bG9jayk7CisgIGJvb2wgcnVuX3Rhc2tzID0gKHRlYW0tPnRhc2tfY291bnQgIT0gMCk7 CisgIGlmIChiYXItPnRvdGFsID4gMSkKKyAgICBydW5fdGFza3MgPSBfX2J1aWx0aW5fbnZw dHhfYmFyX3JlZF9vciAoMSwgMzIgKiBiYXItPnRvdGFsLCB0cnVlLAorCQkJCQkgICAgKHRl YW0tPnRhc2tfY291bnQgIT0gMCkpOwogCi0gIC8qIEZ1dGV4IHNlbWFudGljczogb25seSBn byB0byBzbGVlcCBpZiAqYWRkciA9PSB2YWwuICAqLwotICBpZiAoX19idWlsdGluX2V4cGVj 
dCAoX19hdG9taWNfbG9hZF9uIChhZGRyLCBNRU1NT0RFTF9BQ1FVSVJFKSAhPSB2YWwsIDAp KQorICBpZiAoX19idWlsdGluX2V4cGVjdCAoc3RhdGUgJiBCQVJfV0FTX0xBU1QsIDApKQog ICAgIHsKLSAgICAgIGdvbXBfbXV0ZXhfdW5sb2NrICgmYmFyLT5sb2NrKTsKLSAgICAgIHJl dHVybjsKKyAgICAgIC8qIE5leHQgdGltZSB3ZSdsbCBiZSBhd2FpdGluZyBUT1RBTCB0aHJl YWRzIGFnYWluLiAgKi8KKyAgICAgIGJhci0+YXdhaXRlZCA9IGJhci0+dG90YWw7CisgICAg ICB0ZWFtLT53b3JrX3NoYXJlX2NhbmNlbGxlZCA9IDA7CiAgICAgfQogCi0gIC8qIFJlZ2lz dGVyIGFzIHdhaXRlci4gICovCi0gIHVuc2lnbmVkIGludCB3YWl0ZXJzCi0gICAgPSBfX2F0 b21pY19hZGRfZmV0Y2ggKCZiYXItPndhaXRlcnMsIDEsIE1FTU1PREVMX0FDUV9SRUwpOwot ICBpZiAod2FpdGVycyA9PSAwKQotICAgIF9fYnVpbHRpbl9hYm9ydCAoKTsKLSAgdW5zaWdu ZWQgaW50IHdhaXRlcl9pZCA9IHdhaXRlcnM7Ci0KLSAgaWYgKHdhaXRlcnMgPiAxKQorICBp ZiAoX19idWlsdGluX2V4cGVjdCAocnVuX3Rhc2tzID09IHRydWUsIDApKQogICAgIHsKLSAg ICAgIC8qIFdha2Ugb3RoZXIgdGhyZWFkcyBpbiBiYXIuc3luYy4gICovCi0gICAgICBhc20g dm9sYXRpbGUgKCJiYXIuc3luYyAxLCAlMDsiIDogOiAiciIgKDMyICogd2FpdGVycykpOwor ICAgICAgd2hpbGUgKF9fYXRvbWljX2xvYWRfbiAoJmJhci0+Z2VuZXJhdGlvbiwgTUVNTU9E RUxfQUNRVUlSRSkKKwkgICAgICYgQkFSX1RBU0tfUEVORElORykKKwlnb21wX2JhcnJpZXJf aGFuZGxlX3Rhc2tzIChzdGF0ZSk7CiAKLSAgICAgIC8qIEVuc3VyZSB0aGF0IHRoZXkgaGF2 ZSB1cGRhdGVkIHdhaXRlcnMuICAqLwotICAgICAgYXNtIHZvbGF0aWxlICgiYmFyLnN5bmMg MSwgJTA7IiA6IDogInIiICgzMiAqIHdhaXRlcnMpKTsKKyAgICAgIGlmIChiYXItPnRvdGFs ID4gMSkKKwlhc20gdm9sYXRpbGUgKCJiYXIuc3luYyAxLCAlMDsiIDogOiAiciIgKDMyICog YmFyLT50b3RhbCkpOwogICAgIH0KK30KIAotICBnb21wX211dGV4X3VubG9jayAoJmJhci0+ bG9jayk7Ci0KLSAgd2hpbGUgKDEpCi0gICAgewotICAgICAgLyogV2FpdCBmb3IgbmV4dCB0 aHJlYWQgaW4gYmFycmllci4gICovCi0gICAgICBhc20gdm9sYXRpbGUgKCJiYXIuc3luYyAx LCAlMDsiIDogOiAiciIgKDMyICogKHdhaXRlcnMgKyAxKSkpOwotCi0gICAgICAvKiBHZXQg dXBkYXRlZCB3YWl0ZXJzLiAgKi8KLSAgICAgIHVuc2lnbmVkIGludCB1cGRhdGVkX3dhaXRl cnMKLQk9IF9fYXRvbWljX2xvYWRfbiAoJmJhci0+d2FpdGVycywgTUVNTU9ERUxfQUNRVUlS RSk7Ci0KLSAgICAgIC8qIE5vdGlmeSB0aGF0IHdlIGhhdmUgdXBkYXRlZCB3YWl0ZXJzLiAg Ki8KLSAgICAgIGFzbSB2b2xhdGlsZSAoImJhci5zeW5jIDEsICUwOyIgOiA6ICJyIiAoMzIg 
KiAod2FpdGVycyArIDEpKSk7Ci0KLSAgICAgIHdhaXRlcnMgPSB1cGRhdGVkX3dhaXRlcnM7 Ci0KLSAgICAgIGlmICh3YWl0ZXJfaWQgPiB3YWl0ZXJzKQotCS8qIEEgd2FrZSBoYXBwZW5l ZCwgYW5kIHdlJ3JlIGluIHRoZSBncm91cCBvZiB3b2tlbiB0aHJlYWRzLiAgKi8KLQlicmVh azsKLQotICAgICAgLyogQ29udGludWUgd2FpdGluZy4gICovCi0gICAgfQordm9pZAorZ29t cF90ZWFtX2JhcnJpZXJfd2FpdCAoZ29tcF9iYXJyaWVyX3QgKmJhcikKK3sKKyAgZ29tcF90 ZWFtX2JhcnJpZXJfd2FpdF9lbmQgKGJhciwgZ29tcF9iYXJyaWVyX3dhaXRfc3RhcnQgKGJh cikpOwogfQogCi0vKiBJbXBsZW1lbnQgZnV0ZXhfd2FrZS1saWtlIGJlaGF2aW91ciB0byBw bHVnIGludG8gdGhlIGxpbnV4L2Jhci5jCi0gICBpbXBsZW1lbnRhdGlvbi4gIEFzc3VtZXMg QUREUiBpcyAmYmFyLT5nZW5lcmF0aW9uLiAgKi8KK3ZvaWQKK2dvbXBfdGVhbV9iYXJyaWVy X3dhaXRfZmluYWwgKGdvbXBfYmFycmllcl90ICpiYXIpCit7CisgIGdvbXBfYmFycmllcl9z dGF0ZV90IHN0YXRlID0gZ29tcF9iYXJyaWVyX3dhaXRfZmluYWxfc3RhcnQgKGJhcik7Cisg IGlmIChfX2J1aWx0aW5fZXhwZWN0IChzdGF0ZSAmIEJBUl9XQVNfTEFTVCwgMCkpCisgICAg YmFyLT5hd2FpdGVkX2ZpbmFsID0gYmFyLT50b3RhbDsKKyAgZ29tcF90ZWFtX2JhcnJpZXJf d2FpdF9lbmQgKGJhciwgc3RhdGUpOworfQogCi1zdGF0aWMgaW5saW5lIHZvaWQKLWZ1dGV4 X3dha2UgKGludCAqYWRkciwgaW50IGNvdW50KQorYm9vbAorZ29tcF90ZWFtX2JhcnJpZXJf d2FpdF9jYW5jZWxfZW5kIChnb21wX2JhcnJpZXJfdCAqYmFyLAorCQkJCSAgIGdvbXBfYmFy cmllcl9zdGF0ZV90IHN0YXRlKQogewotICBnb21wX2JhcnJpZXJfdCAqYmFyID0gZ2VuZXJh dGlvbl90b19iYXJyaWVyIChhZGRyKTsKKyAgc3RydWN0IGdvbXBfdGhyZWFkICp0aHIgPSBn b21wX3RocmVhZCAoKTsKKyAgc3RydWN0IGdvbXBfdGVhbSAqdGVhbSA9IHRoci0+dHMudGVh bTsKIAotICBpZiAoYmFyLT50b3RhbCA8IDIpCi0gICAgLyogQSBiYXJyaWVyIHdpdGggbGVz cyB0aGFuIHR3byB0aHJlYWRzLCBub3AuICAqLwotICAgIHJldHVybjsKKyAgYm9vbCBydW5f dGFza3MgPSAodGVhbS0+dGFza19jb3VudCAhPSAwKTsKKyAgaWYgKGJhci0+dG90YWwgPiAx KQorICAgIHJ1bl90YXNrcyA9IF9fYnVpbHRpbl9udnB0eF9iYXJfcmVkX29yICgxLCAzMiAq IGJhci0+dG90YWwsIHRydWUsCisJCQkJCSAgICAodGVhbS0+dGFza19jb3VudCAhPSAwKSk7 CisgIGlmIChzdGF0ZSAmIEJBUl9DQU5DRUxMRUQpCisgICAgcmV0dXJuIHRydWU7CiAKLSAg Z29tcF9tdXRleF9sb2NrICgmYmFyLT5sb2NrKTsKLSAgdW5zaWduZWQgaW50IHdhaXRlcnMg PSBfX2F0b21pY19sb2FkX24gKCZiYXItPndhaXRlcnMsIE1FTU1PREVMX0FDUVVJUkUpOwot 
ICBpZiAod2FpdGVycyA9PSAwKQorICBpZiAoX19idWlsdGluX2V4cGVjdCAoc3RhdGUgJiBC QVJfV0FTX0xBU1QsIDApKQogICAgIHsKLSAgICAgIC8qIE5vIHRocmVhZHMgdG8gd2FrZS4g ICovCi0gICAgICBnb21wX211dGV4X3VubG9jayAoJmJhci0+bG9jayk7Ci0gICAgICByZXR1 cm47CisgICAgICAvKiBOb3RlOiBCQVJfQ0FOQ0VMTEVEIHNob3VsZCBuZXZlciBiZSBzZXQg aW4gc3RhdGUgaGVyZSwgYmVjYXVzZQorCSBjYW5jZWxsYXRpb24gbWVhbnMgdGhhdCBhdCBs ZWFzdCBvbmUgb2YgdGhlIHRocmVhZHMgaGFzIGJlZW4KKwkgY2FuY2VsbGVkLCB0aHVzIG9u IGEgY2FuY2VsbGFibGUgYmFycmllciB3ZSBzaG91bGQgbmV2ZXIgc2VlCisJIGFsbCB0aHJl YWRzIHRvIGFycml2ZS4gICovCisKKyAgICAgIC8qIE5leHQgdGltZSB3ZSdsbCBiZSBhd2Fp dGluZyBUT1RBTCB0aHJlYWRzIGFnYWluLiAgKi8KKyAgICAgIGJhci0+YXdhaXRlZCA9IGJh ci0+dG90YWw7CisgICAgICB0ZWFtLT53b3JrX3NoYXJlX2NhbmNlbGxlZCA9IDA7CiAgICAg fQogCi0gIGlmIChjb3VudCA9PSBJTlRfTUFYKQotICAgIC8qIFJlbGVhc2UgYWxsIHRocmVh ZHMuICAqLwotICAgIF9fYXRvbWljX3N0b3JlX24gKCZiYXItPndhaXRlcnMsIDAsIE1FTU1P REVMX1JFTEVBU0UpOwotICBlbHNlIGlmIChjb3VudCA8IGJhci0+dG90YWwpCi0gICAgLyog UmVsZWFzZSBjb3VudCB0aHJlYWRzLiAgKi8KLSAgICBfX2F0b21pY19hZGRfZmV0Y2ggKCZi YXItPndhaXRlcnMsIC1jb3VudCwgTUVNTU9ERUxfQUNRX1JFTCk7Ci0gIGVsc2UKLSAgICAv KiBDb3VudCBoYXMgYW4gaWxsZWdhbCB2YWx1ZS4gICovCi0gICAgX19idWlsdGluX2Fib3J0 ICgpOwotCi0gIC8qIFdha2Ugb3RoZXIgdGhyZWFkcyBpbiBiYXIuc3luYy4gICovCi0gIGFz bSB2b2xhdGlsZSAoImJhci5zeW5jIDEsICUwOyIgOiA6ICJyIiAoMzIgKiAod2FpdGVycyAr IDEpKSk7CisgIGlmIChfX2J1aWx0aW5fZXhwZWN0IChydW5fdGFza3MgPT0gdHJ1ZSwgMCkp CisgICAgeworICAgICAgd2hpbGUgKF9fYXRvbWljX2xvYWRfbiAoJmJhci0+Z2VuZXJhdGlv biwgTUVNTU9ERUxfQUNRVUlSRSkKKwkgICAgICYgQkFSX1RBU0tfUEVORElORykKKwlnb21w X2JhcnJpZXJfaGFuZGxlX3Rhc2tzIChzdGF0ZSk7CiAKLSAgLyogTGV0IHRoZW0gZ2V0IHRo ZSB1cGRhdGVkIHdhaXRlcnMuICAqLwotICBhc20gdm9sYXRpbGUgKCJiYXIuc3luYyAxLCAl MDsiIDogOiAiciIgKDMyICogKHdhaXRlcnMgKyAxKSkpOworICAgICAgaWYgKGJhci0+dG90 YWwgPiAxKQorCWFzbSB2b2xhdGlsZSAoImJhci5zeW5jIDEsICUwOyIgOiA6ICJyIiAoMzIg KiBiYXItPnRvdGFsKSk7CisgICAgfQogCi0gIGdvbXBfbXV0ZXhfdW5sb2NrICgmYmFyLT5s b2NrKTsKKyAgcmV0dXJuIGZhbHNlOwogfQogCi0vKiBDb3BpZWQgZnJvbSBsaW51eC93YWl0 
LmguICAqLwotCi1zdGF0aWMgaW5saW5lIGludCBkb19zcGluIChpbnQgKmFkZHIsIGludCB2 YWwpCitib29sCitnb21wX3RlYW1fYmFycmllcl93YWl0X2NhbmNlbCAoZ29tcF9iYXJyaWVy X3QgKmJhcikKIHsKLSAgLyogVGhlIGN1cnJlbnQgaW1wbGVtZW50YXRpb24gZG9lc24ndCBz cGluLiAgKi8KLSAgcmV0dXJuIDE7CisgIHJldHVybiBnb21wX3RlYW1fYmFycmllcl93YWl0 X2NhbmNlbF9lbmQgKGJhciwgZ29tcF9iYXJyaWVyX3dhaXRfc3RhcnQgKGJhcikpOwogfQog Ci0vKiBDb3BpZWQgZnJvbSBsaW51eC93YWl0LmguICAqLwotCi1zdGF0aWMgaW5saW5lIHZv aWQgZG9fd2FpdCAoaW50ICphZGRyLCBpbnQgdmFsKQordm9pZAorZ29tcF90ZWFtX2JhcnJp ZXJfY2FuY2VsIChzdHJ1Y3QgZ29tcF90ZWFtICp0ZWFtKQogewotICBpZiAoZG9fc3BpbiAo YWRkciwgdmFsKSkKLSAgICBmdXRleF93YWl0IChhZGRyLCB2YWwpOwotfQorICBnb21wX211 dGV4X2xvY2sgKCZ0ZWFtLT50YXNrX2xvY2spOworICBpZiAodGVhbS0+YmFycmllci5nZW5l cmF0aW9uICYgQkFSX0NBTkNFTExFRCkKKyAgICB7CisgICAgICBnb21wX211dGV4X3VubG9j ayAoJnRlYW0tPnRhc2tfbG9jayk7CisgICAgICByZXR1cm47CisgICAgfQorICB0ZWFtLT5i YXJyaWVyLmdlbmVyYXRpb24gfD0gQkFSX0NBTkNFTExFRDsKKyAgZ29tcF9tdXRleF91bmxv Y2sgKCZ0ZWFtLT50YXNrX2xvY2spOwogCi0vKiBSZXVzZSB0aGUgbGludXggaW1wbGVtZW50 YXRpb24uICAqLwotI2RlZmluZSBHT01QX1dBSVRfSCAxCi0jaW5jbHVkZSAiLi4vbGludXgv YmFyLmMiCisgIC8qIFRoZSAnZXhpdCcgaW5zdHJ1Y3Rpb24gY2FuY2VscyB0aGlzIHRocmVh ZCBhbmQgYWxzbyBmdWxsZmlsbHMgYW55IG90aGVyCisgICAgIENUQSB0aHJlYWRzIHdhaXRp bmcgb24gYmFycmllcnMuICAqLworICBhc20gdm9sYXRpbGUgKCJleGl0OyIpOworfQpkaWZm IC0tZ2l0IGEvbGliZ29tcC9jb25maWcvbnZwdHgvYmFyLmggYi9saWJnb21wL2NvbmZpZy9u dnB0eC9iYXIuaAppbmRleCAyOGJmN2Y0Li5kZGRhMzNlIDEwMDY0NAotLS0gYS9saWJnb21w L2NvbmZpZy9udnB0eC9iYXIuaAorKysgYi9saWJnb21wL2NvbmZpZy9udnB0eC9iYXIuaApA QCAtODMsMTAgKzgzLDE2IEBAIGV4dGVybiB2b2lkIGdvbXBfdGVhbV9iYXJyaWVyX3dhaXRf ZW5kIChnb21wX2JhcnJpZXJfdCAqLAogZXh0ZXJuIGJvb2wgZ29tcF90ZWFtX2JhcnJpZXJf d2FpdF9jYW5jZWwgKGdvbXBfYmFycmllcl90ICopOwogZXh0ZXJuIGJvb2wgZ29tcF90ZWFt X2JhcnJpZXJfd2FpdF9jYW5jZWxfZW5kIChnb21wX2JhcnJpZXJfdCAqLAogCQkJCQkgICAg ICAgZ29tcF9iYXJyaWVyX3N0YXRlX3QpOwotZXh0ZXJuIHZvaWQgZ29tcF90ZWFtX2JhcnJp ZXJfd2FrZSAoZ29tcF9iYXJyaWVyX3QgKiwgaW50KTsKIHN0cnVjdCBnb21wX3RlYW07CiBl 
eHRlcm4gdm9pZCBnb21wX3RlYW1fYmFycmllcl9jYW5jZWwgKHN0cnVjdCBnb21wX3RlYW0g Kik7CiAKK3N0YXRpYyBpbmxpbmUgdm9pZAorZ29tcF90ZWFtX2JhcnJpZXJfd2FrZSAoZ29t cF9iYXJyaWVyX3QgKmJhciwgaW50IGNvdW50KQoreworICAvKiBXZSBuZXZlciAid2FrZSB1 cCIgdGhyZWFkcyBvbiBudnB0eC4gIFRocmVhZHMgd2FpdCBhdCBiYXJyaWVyCisgICAgIGlu c3RydWN0aW9ucyB0aWxsIGJhcnJpZXIgZnVsbGZpbGxlZC4gIERvIG5vdGhpbmcgaGVyZS4g ICovCit9CisKIHN0YXRpYyBpbmxpbmUgZ29tcF9iYXJyaWVyX3N0YXRlX3QKIGdvbXBfYmFy cmllcl93YWl0X3N0YXJ0IChnb21wX2JhcnJpZXJfdCAqYmFyKQogewo= --------------UeLwDx0m0zxUTc8qrOCM4f60--