From mboxrd@z Thu Jan 1 00:00:00 1970
Date: Fri, 04 Jan 2013 22:15:00 -0000
Subject: [Patch, libfortran] Improve performance of byte swapped IO
From: Janne Blomqvist
To: GCC Patches, Fortran List
MIME-Version: 1.0
Content-Type: multipart/mixed; boundary=f46d043bd8fe26b25604d27dd0db
Mailing-List: contact gcc-patches-help@gcc.gnu.org; run by ezmlm
Precedence: bulk
Sender: gcc-patches-owner@gcc.gnu.org
X-SW-Source: 2013-01/txt/msg00205.txt.bz2

--f46d043bd8fe26b25604d27dd0db
Content-Type: text/plain; charset=UTF-8

Hi,

currently byte swapped unformatted IO can be quite slow compared to
the same code with no byte swapping. There are two major reasons for
this:

1) The byte swapping code path resorts to transferring data element
by element, leading to a lot of overhead in the IO library.

2) The function used for the actual byte swapping, reverse_memcpy,
while able to handle general element sizes, is not particularly fast,
especially considering that many CPUs have fast byte swapping
instructions (e.g. BSWAP on x86).
In order to access these fast byte swapping instructions, gcc
provides the __builtin_bswap{16,32,64} builtins, falling back to
libgcc code for targets that lack support.

The attached patch fixes these issues. For issue (1), the read path
uses in-place byte swapping of the data that has been read into the
user buffer, while the write path uses a larger temporary buffer
(since we are not allowed to modify the user supplied data in this
case). For issue (2), the patch uses __builtin_bswap{16,32,64} where
appropriate, only falling back to reverse_memcpy for other sizes.

With the attached test program run on a tmpfs filesystem to avoid
doing actual disk IO, I get the following:

- With no byte swapping:

 Unformatted sequential write/read performance test
 Record size           Write MB/s                 Read MB/s
 ==========================================================
           4   52.723842817422202        72.721158943820441
           8   77.508296890856386        97.237815640377221
          16   110.26209495334321        143.80831184546381
          32   173.94872143231535        221.89704881197937
          64   282.19818562682684        373.77854583735541
         128   442.22084579742244        628.80041029142183
         256   636.69620860705299        966.37723642576316
         512   826.05968840738080        1380.8835166612221
        1024   987.18686465197561        1763.5990036057208
        2048   1047.6721544191710        2058.0875622043550
        4096   1115.5817147134801        2251.8731832850176
        8192   1191.5021150996590        2283.8893409728184
       16384   1417.6110909519391        2441.0530373866482
       32768   1570.4413479046018        2543.0836384048471
       65536   1673.0378706502966        2651.2182395008308
      131072   1697.4944246188445        2688.2398923155783
      262144   1669.6329862145872        2735.6611118973292
      524288   1594.4669935231552        2697.7208298823243

- Before patch, with byte swapping:

 Unformatted sequential write/read performance test
 Record size           Write MB/s                 Read MB/s
 ==========================================================
           4   50.572812893689793        68.858701306591627
           8   58.688513300690317        81.591733130441327
          16   73.551188480607820        96.638995590227665
          32   91.593767813989018        116.65817140076214
          64   107.41379323761915        128.32512066346368
         128   121.33499652432221        147.80777892360237
         256   128.99627771476628        155.91619889220266
         512   135.02742063670030        161.30042382365372
        1024   137.02276709585524        164.11267056940963
        2048   138.62774254302394        165.22456826188971
        4096   139.27695763341924        166.34707691429571
        8192   147.64584950575932        166.59526981475742
       16384   147.91235479266419        166.77890398940283
       32768   150.77029430529927        166.90834867503827
       65536   151.59474472614465        166.84075600288520
      131072   155.75202672623249        166.96550283835097
      262144   155.36506626794849        166.78075976148853
      524288   155.64305086921487        167.44468828946083

- After patch, with byte swapping:

 Unformatted sequential write/read performance test
 Record size           Write MB/s                 Read MB/s
 ==========================================================
           4   49.414771776821361        70.808060042286343
           8   72.918156402459772        93.234093684373946
          16   102.72461544178078        136.21700026949074
          32   160.57240200649090        205.97612602315186
          64   249.32082957447636        331.85515010907363
         128   385.71299236810387        522.06354804855266
         256   535.40608912076459        766.59668706247294
         512   669.47864120368524        1006.4275938227961
        1024   742.90538895500265        1187.9846039167674
        2048   789.71340557340523        1333.8411634622269
        4096   826.44253204731683        1395.5536995933605
        8192   832.93540316116662        1361.4621716558986
       16384   897.95081977010113        1469.0940087507722
       32768   961.18736308033317        1533.7736812111871
       65536   989.41384908496832        1564.7013916917260
      131072   1003.6113762068040        1597.4063253370084
      262144   980.03067664324396        1602.3188995993287
      524288   985.82645661078755        1568.9537807626730

Regtested on x86_64-unknown-linux-gnu, Ok for trunk?

2013-01-04  Janne Blomqvist

* io/file_pos.c (unformatted_backspace): Use __builtin_bswapXX
instead of reverse_memcpy.
* io/io.h (reverse_memcpy): Remove prototype.
* io/transfer.c (reverse_memcpy): Make static, move towards
beginning of file.
(bswap_array): New function.
(unformatted_read): Use bswap_array to byte swap the data in-place.
(unformatted_write): Use a larger temp buffer and bswap_array.
(us_read): Use __builtin_bswapXX instead of reverse_memcpy. (write_us_marker): Likewise. -- Janne Blomqvist --f46d043bd8fe26b25604d27dd0db Content-Type: application/octet-stream; name="us_perf2.f90" Content-Disposition: attachment; filename="us_perf2.f90" Content-Transfer-Encoding: base64 X-Attachment-Id: f_hbjvqmh90 Content-length: 2477 ISBUZXN0IHBlcmZvcm1hbmNlIG9mIHVuZm9ybWF0dGVkIHNlcXVlbnRpYWwg d2l0aCBkaWZmZXJlbnQgc2l6ZWQgcmVjb3Jkcy4KISBKYW5uZSBCbG9tcXZp c3QgMjAxMwpwcm9ncmFtIHVzX3BlcmYKICBpbXBsaWNpdCBub25lCiAgaW50 ZWdlciwgcGFyYW1ldGVyIDo6IGQgPSA4CiAgaW50ZWdlciwgcGFyYW1ldGVy IDo6IGk2NCA9IHNlbGVjdGVkX2ludF9raW5kKDE4KQogIGludGVnZXIgOjog aWkKICByZWFsKGQpIDo6IHdzcGVlZCwgcnNwZWVkCgogIHByaW50ICosICdV bmZvcm1hdHRlZCBzZXF1ZW50aWFsIHdyaXRlL3JlYWQgcGVyZm9ybWFuY2Ug dGVzdCcKICBwcmludCAqLCAnUmVjb3JkIHNpemUgICAgICAgICAgIFdyaXRl IE1CL3MgICAgICAgICAgICAgICAgIFJlYWQgTUIvcycKICBwcmludCAqLCAn PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09 PT09PT09PT09PT09PScKICBpaSA9IDEKICBkbwogICAgIGNhbGwgcnVuX3Vz X3Rlc3QgKGlpLCB3c3BlZWQsIHJzcGVlZCkKICAgICBwcmludCAqLCBpaSo0 LCB3c3BlZWQsIHJzcGVlZAogICAgIGlmIChpaSA+IDEwMDAwMCkgdGhlbgog ICAgICAgIGV4aXQKICAgICBlbmQgaWYKICAgICBpaSA9IGlpICogMgogIGVu ZCBkbwoKY29udGFpbnMKICBzdWJyb3V0aW5lIHJ1bl91c190ZXN0IChuLCB3 cywgcnMpCiAgICBpbnRlZ2VyLCBpbnRlbnQoaW4pIDo6IG4KICAgIHJlYWwo ZCksIGludGVudChvdXQpIDo6IHdzLCBycwogICAgaW50ZWdlciwgYWxsb2Nh dGFibGUgOjogZGF0YSg6KQogICAgcmVhbChkKSA6OiB0MSwgdDIKICAgIGlu dGVnZXIgOjogaWksIGxvb3BzCiAgICBpbnRlZ2VyLCBwYXJhbWV0ZXIgOjog bnNpemUgPSAxMDAwMDAwMCAhIDEwIE1CCgogICAgISBXcml0ZSBuc2l6ZSAq IGxvZyhuICsgMSkgYnl0ZXMsICBlYWNoIHJlY29yZCBpcyBuIGVsZW1lbnRz IG9mIDQgYnl0ZXMgZWFjaAogICAgISArIHR3byA0IGJ5dGUgcmVjb3JkIG1h cmtlcnMKICAgIGxvb3BzID0gbnNpemUgKiBsb2cobiArIDEuX2QpIC8gKG4q NC5fZCArIDguX2QpCgogICAgYWxsb2NhdGUoZGF0YShuKSkKICAgIGRhdGEg PSAxMjMKICAgIG9wZW4oMTAsIGZpbGU9InVzcGVyZi5kYXQiLCBmb3JtPSd1 bmZvcm1hdHRlZCcsIGFjY2Vzcz0nc2VxdWVudGlhbCcsIHN0YXR1cz0ncmVw bGFjZScpCiAgICBjYWxsIHd0aW1lKHQxKQogICAgZG8gaWkgPSAxLCBsb29w 
cwogICAgICAgd3JpdGUgKDEwKSBkYXRhCiAgICBlbmQgZG8KICAgIGNhbGwg d3RpbWUodDIpCiAgICBjbG9zZSgxMCkKICAgIHdzID0gbnNpemUgKiBsb2co bisxLl9kKSAvIDEwMjQqKjIgLyAodDItdDEpCiAgICBvcGVuKDEwLCBmaWxl PSJ1c3BlcmYuZGF0IiwgZm9ybT0ndW5mb3JtYXR0ZWQnLCBhY2Nlc3M9J3Nl cXVlbnRpYWwnLCBzdGF0dXM9J29sZCcpCiAgICBjYWxsIHd0aW1lKHQxKQog ICAgZG8gaWkgPSAxLCBsb29wcwogICAgICAgcmVhZCAoMTApIGRhdGEKICAg IGVuZCBkbwogICAgY2FsbCB3dGltZSh0MikKICAgIGNsb3NlKDEwLCBzdGF0 dXM9J2RlbGV0ZScpCiAgICBkZWFsbG9jYXRlKGRhdGEpCiAgICBycyA9IG5z aXplICogbG9nKG4rMS5fZCkgLyAxMDI0KioyIC8gKHQyLXQxKQogIGVuZCBz dWJyb3V0aW5lIHJ1bl91c190ZXN0CgogIHN1YnJvdXRpbmUgd3RpbWUodCkK ICAgIHJlYWwoZCkgOjogdAogICAgaW50ZWdlcihpNjQpOjogY291bnQsIHJh dGUKICAgIGNhbGwgc3lzdGVtX2Nsb2NrKGNvdW50LCByYXRlKQogICAgdCA9 IHJlYWwoY291bnQsIGQpIC8gcmF0ZQogIGVuZCBzdWJyb3V0aW5lIHd0aW1l CiAgICAKZW5kIHByb2dyYW0gdXNfcGVyZgo= --f46d043bd8fe26b25604d27dd0db Content-Type: application/octet-stream; name="bswap.diff" Content-Disposition: attachment; filename="bswap.diff" Content-Transfer-Encoding: base64 X-Attachment-Id: f_hbjvqxh01 Content-length: 10330 ZGlmZiAtLWdpdCBhL2xpYmdmb3J0cmFuL2lvL2ZpbGVfcG9zLmMgYi9saWJn Zm9ydHJhbi9pby9maWxlX3Bvcy5jCmluZGV4IGM4ZWNjM2EuLmJmMjI1MGEg MTAwNjQ0Ci0tLSBhL2xpYmdmb3J0cmFuL2lvL2ZpbGVfcG9zLmMKKysrIGIv bGliZ2ZvcnRyYW4vaW8vZmlsZV9wb3MuYwpAQCAtMTQwLDE1ICsxNDAsMjEg QEAgdW5mb3JtYXR0ZWRfYmFja3NwYWNlIChzdF9wYXJhbWV0ZXJfZmlsZXBv cyAqZnBwLCBnZmNfdW5pdCAqdSkKIAl9CiAgICAgICBlbHNlCiAJeworCSAg dWludDMyX3QgdTMyOworCSAgdWludDY0X3QgdTY0OwogCSAgc3dpdGNoIChs ZW5ndGgpCiAJICAgIHsKIAkgICAgY2FzZSBzaXplb2YoR0ZDX0lOVEVHRVJf NCk6Ci0JICAgICAgcmV2ZXJzZV9tZW1jcHkgKCZtNCwgcCwgc2l6ZW9mICht NCkpOworCSAgICAgIG1lbWNweSAoJnUzMiwgcCwgc2l6ZW9mICh1MzIpKTsK KwkgICAgICB1MzIgPSBfX2J1aWx0aW5fYnN3YXAzMiAodTMyKTsKKwkgICAg ICBtNCA9ICooR0ZDX0lOVEVHRVJfNCopJnUzMjsKIAkgICAgICBtID0gbTQ7 CiAJICAgICAgYnJlYWs7CiAKIAkgICAgY2FzZSBzaXplb2YoR0ZDX0lOVEVH RVJfOCk6Ci0JICAgICAgcmV2ZXJzZV9tZW1jcHkgKCZtOCwgcCwgc2l6ZW9m IChtOCkpOworCSAgICAgIG1lbWNweSAoJnU2NCwgcCwgc2l6ZW9mICh1NjQp 
KTsKKwkgICAgICB1NjQgPSBfX2J1aWx0aW5fYnN3YXA2NCAodTY0KTsKKwkg ICAgICBtOCA9ICooR0ZDX0lOVEVHRVJfOCopJnU2NDsKIAkgICAgICBtID0g bTg7CiAJICAgICAgYnJlYWs7CiAKZGlmZiAtLWdpdCBhL2xpYmdmb3J0cmFu L2lvL2lvLmggYi9saWJnZm9ydHJhbi9pby9pby5oCmluZGV4IDQzYWVhZmQu LmYxN2RlMTkgMTAwNjQ0Ci0tLSBhL2xpYmdmb3J0cmFuL2lvL2lvLmgKKysr IGIvbGliZ2ZvcnRyYW4vaW8vaW8uaApAQCAtNjQ5LDkgKzY0OSw2IEBAIGlu dGVybmFsX3Byb3RvKGluaXRfbG9vcF9zcGVjKTsKIGV4dGVybiB2b2lkIG5l eHRfcmVjb3JkIChzdF9wYXJhbWV0ZXJfZHQgKiwgaW50KTsKIGludGVybmFs X3Byb3RvKG5leHRfcmVjb3JkKTsKIAotZXh0ZXJuIHZvaWQgcmV2ZXJzZV9t ZW1jcHkgKHZvaWQgKiwgY29uc3Qgdm9pZCAqLCBzaXplX3QpOwotaW50ZXJu YWxfcHJvdG8gKHJldmVyc2VfbWVtY3B5KTsKLQogZXh0ZXJuIHZvaWQgc3Rf d2FpdCAoc3RfcGFyYW1ldGVyX3dhaXQgKik7CiBleHBvcnRfcHJvdG8oc3Rf d2FpdCk7CiAKZGlmZiAtLWdpdCBhL2xpYmdmb3J0cmFuL2lvL3RyYW5zZmVy LmMgYi9saWJnZm9ydHJhbi9pby90cmFuc2Zlci5jCmluZGV4IDZkZGExZGYu LmViNzdkZjhhIDEwMDY0NAotLS0gYS9saWJnZm9ydHJhbi9pby90cmFuc2Zl ci5jCisrKyBiL2xpYmdmb3J0cmFuL2lvL3RyYW5zZmVyLmMKQEAgLTg3OCw1 MCArODc4LDkxIEBAIHdyaXRlX2J1ZiAoc3RfcGFyYW1ldGVyX2R0ICpkdHAs IHZvaWQgKmJ1Ziwgc2l6ZV90IG5ieXRlcykKIH0KIAogCi0vKiBNYXN0ZXIg ZnVuY3Rpb24gZm9yIHVuZm9ybWF0dGVkIHJlYWRzLiAgKi8KKy8qIFJldmVy c2UgbWVtY3B5IC0gdXNlZCBmb3IgYnl0ZSBzd2FwcGluZy4gICovCiAKIHN0 YXRpYyB2b2lkCi11bmZvcm1hdHRlZF9yZWFkIChzdF9wYXJhbWV0ZXJfZHQg KmR0cCwgYnQgdHlwZSwKLQkJICB2b2lkICpkZXN0LCBpbnQga2luZCwgc2l6 ZV90IHNpemUsIHNpemVfdCBuZWxlbXMpCityZXZlcnNlX21lbWNweSAodm9p ZCAqZGVzdCwgY29uc3Qgdm9pZCAqc3JjLCBzaXplX3QgbikKIHsKLSAgaWYg KGxpa2VseSAoZHRwLT51LnAuY3VycmVudF91bml0LT5mbGFncy5jb252ZXJ0 ID09IEdGQ19DT05WRVJUX05BVElWRSkKLSAgICAgIHx8IGtpbmQgPT0gMSkK KyAgY2hhciAqZCwgKnM7CisgIHNpemVfdCBpOworCisgIGQgPSAoY2hhciAq KSBkZXN0OworICBzID0gKGNoYXIgKikgc3JjICsgbiAtIDE7CisKKyAgLyog V3JpdGUgd2l0aCBhc2NlbmRpbmcgb3JkZXIgLSB0aGlzIGlzIGxpa2VseSBm YXN0ZXIKKyAgICAgb24gbW9kZXJuIGFyY2hpdGVjdHVyZXMgYmVjYXVzZSBv ZiB3cml0ZSBjb21iaW5pbmcuICAqLworICBmb3IgKGk9MDsgaTxuOyBpKysp CisgICAgICAqKGQrKykgPSAqKHMtLSk7Cit9CisKKworLyogVXRpbGl0eSBm 
dW5jdGlvbiBmb3IgYnl0ZXN3YXBwaW5nIGFuIGFycmF5LCB1c2luZyB0aGUg YnN3YXAKKyAgIGJ1aWx0aW5zIGlmIHBvc3NpYmxlLiBkZXN0IGFuZCBzcmMg Y2FuIG92ZXJsYXAuICAqLworCitzdGF0aWMgdm9pZAorYnN3YXBfYXJyYXkg KHZvaWQgKmRlc3QsIGNvbnN0IHZvaWQgKnNyYywgc2l6ZV90IHNpemUsIHNp emVfdCBuZWxlbXMpCit7CisgIGNoYXIgYnVmZmVyWzE2XTsKKyAgY29uc3Qg Y2hhciAqcHM7IAorICBjaGFyICpwZDsKKworICBzd2l0Y2ggKHNpemUpCiAg ICAgewotICAgICAgaWYgKHR5cGUgPT0gQlRfQ0hBUkFDVEVSKQotCXNpemUg Kj0gR0ZDX1NJWkVfT0ZfQ0hBUl9LSU5EKGtpbmQpOwotICAgICAgcmVhZF9i bG9ja19kaXJlY3QgKGR0cCwgZGVzdCwgc2l6ZSAqIG5lbGVtcyk7CisgICAg Y2FzZSAxOgorICAgICAgYnJlYWs7CisgICAgY2FzZSAyOgorICAgICAgZm9y IChzaXplX3QgaSA9IDA7IGkgPCBuZWxlbXM7IGkrKykKKwkoKHVpbnQxNl90 KilkZXN0KVtpXSA9IF9fYnVpbHRpbl9ic3dhcDE2ICgoKHVpbnQxNl90Kilz cmMpW2ldKTsKKyAgICAgIGJyZWFrOworICAgIGNhc2UgNDoKKyAgICAgIGZv ciAoc2l6ZV90IGkgPSAwOyBpIDwgbmVsZW1zOyBpKyspCisJKCh1aW50MzJf dCopZGVzdClbaV0gPSBfX2J1aWx0aW5fYnN3YXAzMiAoKCh1aW50MzJfdCop c3JjKVtpXSk7CisgICAgICBicmVhazsKKyAgICBjYXNlIDg6CisgICAgICBm b3IgKHNpemVfdCBpID0gMDsgaSA8IG5lbGVtczsgaSsrKQorCSgodWludDY0 X3QqKWRlc3QpW2ldID0gX19idWlsdGluX2Jzd2FwNjQgKCgodWludDY0X3Qq KXNyYylbaV0pOworICAgICAgYnJlYWs7CisgICAgZGVmYXVsdDoKKyAgICAg IHBzID0gc3JjOworICAgICAgcGQgPSBkZXN0OworICAgICAgZm9yIChzaXpl X3QgaSA9IDA7IGkgPCBuZWxlbXM7IGkrKykKKwl7CisJICByZXZlcnNlX21l bWNweSAoYnVmZmVyLCBwcywgc2l6ZSk7CisJICBtZW1jcHkgKHBkLCBidWZm ZXIsIHNpemUpOworCSAgcHMgKz0gc2l6ZTsKKwkgIHBkICs9IHNpemU7CisJ fQogICAgIH0KLSAgZWxzZQotICAgIHsKLSAgICAgIGNoYXIgYnVmZmVyWzE2 XTsKLSAgICAgIGNoYXIgKnA7Ci0gICAgICBzaXplX3QgaTsKK30KIAotICAg ICAgcCA9IGRlc3Q7CiAKKy8qIE1hc3RlciBmdW5jdGlvbiBmb3IgdW5mb3Jt YXR0ZWQgcmVhZHMuICAqLworCitzdGF0aWMgdm9pZAordW5mb3JtYXR0ZWRf cmVhZCAoc3RfcGFyYW1ldGVyX2R0ICpkdHAsIGJ0IHR5cGUsCisJCSAgdm9p ZCAqZGVzdCwgaW50IGtpbmQsIHNpemVfdCBzaXplLCBzaXplX3QgbmVsZW1z KQoreworICBpZiAodHlwZSA9PSBCVF9DSEFSQUNURVIpCisgICAgc2l6ZSAq PSBHRkNfU0laRV9PRl9DSEFSX0tJTkQoa2luZCk7CisgIHJlYWRfYmxvY2tf ZGlyZWN0IChkdHAsIGRlc3QsIHNpemUgKiBuZWxlbXMpOworCisgIGlmICh1 
bmxpa2VseSAoZHRwLT51LnAuY3VycmVudF91bml0LT5mbGFncy5jb252ZXJ0 ID09IEdGQ19DT05WRVJUX1NXQVApCisgICAgICAmJiBraW5kICE9IDEpCisg ICAgewogICAgICAgLyogSGFuZGxlIHdpZGUgY2hyYWN0ZXJzLiAgKi8KLSAg ICAgIGlmICh0eXBlID09IEJUX0NIQVJBQ1RFUiAmJiBraW5kICE9IDEpCi0J ewotCSAgbmVsZW1zICo9IHNpemU7Ci0JICBzaXplID0ga2luZDsKLQl9Cisg ICAgICBpZiAodHlwZSA9PSBCVF9DSEFSQUNURVIpCisgIAl7CisgIAkgIG5l bGVtcyAqPSBzaXplOworICAJICBzaXplID0ga2luZDsKKyAgCX0KIAogICAg ICAgLyogQnJlYWsgdXAgY29tcGxleCBpbnRvIGl0cyBjb25zdGl0dWVudCBy ZWFscy4gICovCi0gICAgICBpZiAodHlwZSA9PSBCVF9DT01QTEVYKQotCXsK LQkgIG5lbGVtcyAqPSAyOwotCSAgc2l6ZSAvPSAyOwotCX0KLSAgICAgIAot ICAgICAgLyogQnkgbm93LCBhbGwgY29tcGxleCB2YXJpYWJsZXMgaGF2ZSBi ZWVuIHNwbGl0IGludG8gdGhlaXIKLQkgY29uc3RpdHVlbnQgcmVhbHMuICAq LwotICAgICAgCi0gICAgICBmb3IgKGkgPSAwOyBpIDwgbmVsZW1zOyBpKysp Ci0JewotIAkgIHJlYWRfYmxvY2tfZGlyZWN0IChkdHAsIGJ1ZmZlciwgc2l6 ZSk7Ci0gCSAgcmV2ZXJzZV9tZW1jcHkgKHAsIGJ1ZmZlciwgc2l6ZSk7Ci0g CSAgcCArPSBzaXplOwotIAl9CisgICAgICBlbHNlIGlmICh0eXBlID09IEJU X0NPTVBMRVgpCisgIAl7CisgIAkgIG5lbGVtcyAqPSAyOworICAJICBzaXpl IC89IDI7CisgIAl9CisgICAgICBic3dhcF9hcnJheSAoZGVzdCwgZGVzdCwg c2l6ZSwgbmVsZW1zKTsKICAgICB9CiB9CiAKQEAgLTk0NSw5ICs5ODYsMTAg QEAgdW5mb3JtYXR0ZWRfd3JpdGUgKHN0X3BhcmFtZXRlcl9kdCAqZHRwLCBi dCB0eXBlLAogICAgIH0KICAgZWxzZQogICAgIHsKLSAgICAgIGNoYXIgYnVm ZmVyWzE2XTsKKyNkZWZpbmUgQlNXQVBfQlVGU1ogNTEyCisgICAgICBjaGFy IGJ1ZmZlcltCU1dBUF9CVUZTWl07CiAgICAgICBjaGFyICpwOwotICAgICAg c2l6ZV90IGk7CisgICAgICBzaXplX3QgbnJlbTsKIAogICAgICAgcCA9IHNv dXJjZTsKIApAQCAtOTY4LDEyICsxMDEwLDIxIEBAIHVuZm9ybWF0dGVkX3dy aXRlIChzdF9wYXJhbWV0ZXJfZHQgKmR0cCwgYnQgdHlwZSwKICAgICAgIC8q IEJ5IG5vdywgYWxsIGNvbXBsZXggdmFyaWFibGVzIGhhdmUgYmVlbiBzcGxp dCBpbnRvIHRoZWlyCiAJIGNvbnN0aXR1ZW50IHJlYWxzLiAgKi8KIAotICAg ICAgZm9yIChpID0gMDsgaSA8IG5lbGVtczsgaSsrKQorICAgICAgbnJlbSA9 IG5lbGVtczsKKyAgICAgIGRvCiAJewotCSAgcmV2ZXJzZV9tZW1jcHkoYnVm ZmVyLCBwLCBzaXplKTsKLSAJICBwICs9IHNpemU7Ci0JICB3cml0ZV9idWYg KGR0cCwgYnVmZmVyLCBzaXplKTsKKwkgIHNpemVfdCBuYzsKKwkgIGlmIChz 
aXplICogbnJlbSA+IEJTV0FQX0JVRlNaKQorCSAgICBuYyA9IEJTV0FQX0JV RlNaIC8gc2l6ZTsKKwkgIGVsc2UKKwkgICAgbmMgPSBucmVtOworCisJICBi c3dhcF9hcnJheSAoYnVmZmVyLCBwLCBzaXplLCBuYyk7CisJICB3cml0ZV9i dWYgKGR0cCwgYnVmZmVyLCBzaXplICogbmMpOworCSAgcCArPSBzaXplICog bmM7CisJICBucmVtIC09IG5jOwogCX0KKyAgICAgIHdoaWxlIChucmVtID4g MCk7CiAgICAgfQogfQogCkBAIC0yMTUzLDE1ICsyMjA0LDIyIEBAIHVzX3Jl YWQgKHN0X3BhcmFtZXRlcl9kdCAqZHRwLCBpbnQgY29udGludWVkKQogCX0K ICAgICB9CiAgIGVsc2UKKyAgICB7CisgICAgICB1aW50MzJfdCB1MzI7Cisg ICAgICB1aW50NjRfdCB1NjQ7CiAgICAgICBzd2l0Y2ggKG5yKQogCXsKIAlj YXNlIHNpemVvZihHRkNfSU5URUdFUl80KToKLQkgIHJldmVyc2VfbWVtY3B5 ICgmaTQsICZpLCBzaXplb2YgKGk0KSk7CisJICBtZW1jcHkgKCZ1MzIsICZp LCBzaXplb2YgKHUzMikpOworCSAgdTMyID0gX19idWlsdGluX2Jzd2FwMzIg KHUzMik7CisJICBpNCA9ICooR0ZDX0lOVEVHRVJfNCopJnUzMjsKIAkgIGkg PSBpNDsKIAkgIGJyZWFrOwogCiAJY2FzZSBzaXplb2YoR0ZDX0lOVEVHRVJf OCk6Ci0JICByZXZlcnNlX21lbWNweSAoJmk4LCAmaSwgc2l6ZW9mIChpOCkp OworCSAgbWVtY3B5ICgmdTY0LCAmaSwgc2l6ZW9mICh1NjQpKTsKKwkgIHU2 NCA9IF9fYnVpbHRpbl9ic3dhcDY0ICh1NjQpOworCSAgaTggPSAqKEdGQ19J TlRFR0VSXzgqKSZ1NjQ7CiAJICBpID0gaTg7CiAJICBicmVhazsKIApAQCAt MjE2OSw2ICsyMjI3LDcgQEAgdXNfcmVhZCAoc3RfcGFyYW1ldGVyX2R0ICpk dHAsIGludCBjb250aW51ZWQpCiAJICBydW50aW1lX2Vycm9yICgiSWxsZWdh bCB2YWx1ZSBmb3IgcmVjb3JkIG1hcmtlciIpOwogCSAgYnJlYWs7CiAJfQor ICAgIH0KIAogICBpZiAoaSA+PSAwKQogICAgIHsKQEAgLTMwMzYsNyArMzA5 NSw2IEBAIHdyaXRlX3VzX21hcmtlciAoc3RfcGFyYW1ldGVyX2R0ICpkdHAs IGNvbnN0IGdmY19vZmZzZXQgYnVmKQogICBzaXplX3QgbGVuOwogICBHRkNf SU5URUdFUl80IGJ1ZjQ7CiAgIEdGQ19JTlRFR0VSXzggYnVmODsKLSAgY2hh ciBwW3NpemVvZiAoR0ZDX0lOVEVHRVJfOCldOwogCiAgIGlmIChjb21waWxl X29wdGlvbnMucmVjb3JkX21hcmtlciA9PSAwKQogICAgIGxlbiA9IHNpemVv ZiAoR0ZDX0lOVEVHRVJfNCk7CkBAIC0zMDY1LDE4ICszMTIzLDIwIEBAIHdy aXRlX3VzX21hcmtlciAoc3RfcGFyYW1ldGVyX2R0ICpkdHAsIGNvbnN0IGdm Y19vZmZzZXQgYnVmKQogICAgIH0KICAgZWxzZQogICAgIHsKKyAgICAgIHVp bnQzMl90IHUzMjsKKyAgICAgIHVpbnQ2NF90IHU2NDsKICAgICAgIHN3aXRj aCAobGVuKQogCXsKIAljYXNlIHNpemVvZiAoR0ZDX0lOVEVHRVJfNCk6CiAJ 
ICBidWY0ID0gYnVmOwotCSAgcmV2ZXJzZV9tZW1jcHkgKHAsICZidWY0LCBz aXplb2YgKEdGQ19JTlRFR0VSXzQpKTsKLQkgIHJldHVybiBzd3JpdGUgKGR0 cC0+dS5wLmN1cnJlbnRfdW5pdC0+cywgcCwgbGVuKTsKKwkgIHUzMiA9IF9f YnVpbHRpbl9ic3dhcDMyICgqKHVpbnQzMl90KikmYnVmNCk7CisJICByZXR1 cm4gc3dyaXRlIChkdHAtPnUucC5jdXJyZW50X3VuaXQtPnMsICZ1MzIsIGxl bik7CiAJICBicmVhazsKIAogCWNhc2Ugc2l6ZW9mIChHRkNfSU5URUdFUl84 KToKIAkgIGJ1ZjggPSBidWY7Ci0JICByZXZlcnNlX21lbWNweSAocCwgJmJ1 ZjgsIHNpemVvZiAoR0ZDX0lOVEVHRVJfOCkpOwotCSAgcmV0dXJuIHN3cml0 ZSAoZHRwLT51LnAuY3VycmVudF91bml0LT5zLCBwLCBsZW4pOworCSAgdTY0 ID0gX19idWlsdGluX2Jzd2FwNjQgKCoodWludDY0X3QqKSZidWY4KTsKKwkg IHJldHVybiBzd3JpdGUgKGR0cC0+dS5wLmN1cnJlbnRfdW5pdC0+cywgJnU2 NCwgbGVuKTsKIAkgIGJyZWFrOwogCiAJZGVmYXVsdDoKQEAgLTM3MTMsMjIg KzM3NzMsNiBAQCBzdF9zZXRfbm1sX3Zhcl9kaW0gKHN0X3BhcmFtZXRlcl9k dCAqZHRwLCBHRkNfSU5URUdFUl80IG5fZGltLAogICBHRkNfRElNRU5TSU9O X1NFVChubWwtPmRpbVtuXSxsYm91bmQsdWJvdW5kLHN0cmlkZSk7CiB9CiAK LS8qIFJldmVyc2UgbWVtY3B5IC0gdXNlZCBmb3IgYnl0ZSBzd2FwcGluZy4g ICovCi0KLXZvaWQgcmV2ZXJzZV9tZW1jcHkgKHZvaWQgKmRlc3QsIGNvbnN0 IHZvaWQgKnNyYywgc2l6ZV90IG4pCi17Ci0gIGNoYXIgKmQsICpzOwotICBz aXplX3QgaTsKLQotICBkID0gKGNoYXIgKikgZGVzdDsKLSAgcyA9IChjaGFy ICopIHNyYyArIG4gLSAxOwotCi0gIC8qIFdyaXRlIHdpdGggYXNjZW5kaW5n IG9yZGVyIC0gdGhpcyBpcyBsaWtlbHkgZmFzdGVyCi0gICAgIG9uIG1vZGVy biBhcmNoaXRlY3R1cmVzIGJlY2F1c2Ugb2Ygd3JpdGUgY29tYmluaW5nLiAg Ki8KLSAgZm9yIChpPTA7IGk8bjsgaSsrKQotICAgICAgKihkKyspID0gKihz LS0pOwotfQotCiAKIC8qIE9uY2UgdXBvbiBhIHRpbWUsIGEgcG9vciBpbm5v Y2VudCBGb3J0cmFuIHByb2dyYW0gd2FzIHJlYWRpbmcgYQogICAgZmlsZSwg d2hlbiBzdWRkZW5seSBpdCBoaXQgdGhlIGVuZC1vZi1maWxlIChFT0YpLiAg VW5mb3J0dW5hdGVseQo= --f46d043bd8fe26b25604d27dd0db--
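
As a rough illustration of the approach described in the message body (the bswap builtins for 2, 4 and 8 byte elements, a generic byte reversal for every other size), here is a minimal standalone sketch in C. It is not the code from the attached bswap.diff: the names swap_elements and reverse_copy are made up for this example, it goes through memcpy rather than pointer casts to sidestep alignment and aliasing questions, and it assumes a compiler that provides the GCC __builtin_bswap16/32/64 builtins and element sizes of at most 16 bytes.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: copy n bytes from src to dest in reverse order.
   src and dest must not overlap.  */
static void
reverse_copy (void *dest, const void *src, size_t n)
{
  unsigned char *d = dest;
  const unsigned char *s = (const unsigned char *) src + n - 1;

  for (size_t i = 0; i < n; i++)
    *d++ = *s--;
}

/* Hypothetical sketch: byte swap nelems elements of `size` bytes each
   from src into dest.  dest and src may be the same buffer, which gives
   an in-place swap.  */
static void
swap_elements (void *dest, const void *src, size_t size, size_t nelems)
{
  unsigned char tmp[16];
  unsigned char *pd = dest;
  const unsigned char *ps = src;

  /* Assumption of this sketch: elements fit in the scratch buffer.  */
  assert (size >= 1 && size <= sizeof tmp);

  for (size_t i = 0; i < nelems; i++, ps += size, pd += size)
    {
      switch (size)
        {
        case 1:
          /* Single bytes have no byte order; just copy.  */
          *pd = *ps;
          break;
        case 2:
          {
            uint16_t v;
            memcpy (&v, ps, sizeof v);
            v = __builtin_bswap16 (v);
            memcpy (pd, &v, sizeof v);
          }
          break;
        case 4:
          {
            uint32_t v;
            memcpy (&v, ps, sizeof v);
            v = __builtin_bswap32 (v);
            memcpy (pd, &v, sizeof v);
          }
          break;
        case 8:
          {
            uint64_t v;
            memcpy (&v, ps, sizeof v);
            v = __builtin_bswap64 (v);
            memcpy (pd, &v, sizeof v);
          }
          break;
        default:
          /* Generic sizes: reverse into a scratch buffer first so that
             the in-place case (dest == src) stays correct.  */
          reverse_copy (tmp, ps, size);
          memcpy (pd, tmp, size);
          break;
        }
    }
}
```

Calling swap_elements (buf, buf, 4, n) swaps an array of n 4-byte values in place, which corresponds to the read path described in the message. The write path would instead call it chunk by chunk with a temporary buffer as dest, so that the caller's data is left unmodified.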