From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from smtp-out1.suse.de (smtp-out1.suse.de [IPv6:2001:67c:2178:6::1c]) by sourceware.org (Postfix) with ESMTPS id 2EA3E3858C27 for ; Mon, 18 Jul 2022 07:30:30 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.1 sourceware.org 2EA3E3858C27 Received: from imap2.suse-dmz.suse.de (imap2.suse-dmz.suse.de [192.168.254.74]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature ECDSA (P-521) server-digest SHA512) (No client certificate requested) by smtp-out1.suse.de (Postfix) with ESMTPS id 6D5E834E38; Mon, 18 Jul 2022 07:30:29 +0000 (UTC) Received: from imap2.suse-dmz.suse.de (imap2.suse-dmz.suse.de [192.168.254.74]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature ECDSA (P-521) server-digest SHA512) (No client certificate requested) by imap2.suse-dmz.suse.de (Postfix) with ESMTPS id 56D9213A37; Mon, 18 Jul 2022 07:30:29 +0000 (UTC) Received: from dovecot-director2.suse.de ([192.168.254.65]) by imap2.suse-dmz.suse.de with ESMTPSA id rUn0ExUM1WLYcwAAMHmgww (envelope-from ); Mon, 18 Jul 2022 07:30:29 +0000 Content-Type: multipart/mixed; boundary="------------PkOyhdId99q0ph6iZa1aMryn" Message-ID: Date: Mon, 18 Jul 2022 09:30:28 +0200 MIME-Version: 1.0 User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:91.0) Gecko/20100101 Thunderbird/91.9.0 Subject: Re: [RFC][gdbsupport] Improve thread scheduling in parallel_for_each Content-Language: en-US To: Tom Tromey , Tom de Vries via Gdb-patches References: <20220715094034.GA10751@delia.home> <87y1wumc1f.fsf@tromey.com> From: Tom de Vries In-Reply-To: <87y1wumc1f.fsf@tromey.com> X-Spam-Status: No, score=-13.0 required=5.0 tests=BAYES_00, DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, DKIM_VALID_EF, GIT_PATCH_0, NICE_REPLY_A, SPF_HELO_NONE, SPF_PASS, TXREP autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org X-BeenThere: gdb-patches@sourceware.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Gdb-patches mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 18 Jul 2022 07:30:32 -0000 This is a multi-part message in MIME format. --------------PkOyhdId99q0ph6iZa1aMryn Content-Type: text/plain; charset=UTF-8; format=flowed Content-Transfer-Encoding: 7bit On 7/15/22 21:05, Tom Tromey wrote: >>>>>> "Tom" == Tom de Vries via Gdb-patches writes: > > Tom> This introduces a performance regression on a particular test-case I happened > Tom> to use: > Tom> ... > Tom> $ for n in $(seq 1 10); do \ > Tom> time gdb -q -batch libxul.so.debug 2>&1 | grep real:; \ > Tom> done > Tom> ... > Tom> so revert to the original schedule by reducing the worker threads: > Tom> ... > > This seems like making a change and then undoing it somewhere else? > > Tom> Still, the performance experiment yields a slight performance loss. > > Sounds bad. > > Tom> if (n_threads < 0) > Tom> - n_threads = std::thread::hardware_concurrency (); > Tom> + { > Tom> + n_threads = std::thread::hardware_concurrency (); > Tom> + if (n_threads > 0) > Tom> + /* Account for main thread. */ > Tom> + n_threads--; > Tom> + } > > I think it's better if the setting just directly controls how many > threads there are. Then elsewhere we can decide what that means -- like > if it performs better with the defaults to not do any work in the main > thread, then parallel_for_each can be modified to just send tasks to the > workers and do nothing in the main thread except wait for results. > > Tom> size_t elts_per_thread = 0; > [...] > Tom> + elts_per_thread = n_elements / n_threads; > > The initial declaration can be removed and then this latter line can > declare the variable as well. > > Tom> for (int i = 0; i < count; ++i) > Tom> { > Tom> RandomIt end = first + elts_per_thread; > Tom> + if (i < left_over) > Tom> + end++; > > It may be nice to mention the distribution of leftovers in a comment > somewhere. I've ended up committing this patch, which just does the leftovers distribution part, with some comments added. Thanks, - Tom --------------PkOyhdId99q0ph6iZa1aMryn Content-Type: text/x-patch; charset=UTF-8; name="0001-gdbsupport-Improve-thread-scheduling-in-parallel_for_each.patch" Content-Disposition: attachment; filename*0="0001-gdbsupport-Improve-thread-scheduling-in-parallel_for_ea"; filename*1="ch.patch" Content-Transfer-Encoding: base64 W2dkYnN1cHBvcnRdIEltcHJvdmUgdGhyZWFkIHNjaGVkdWxpbmcgaW4gcGFyYWxsZWxfZm9y X2VhY2gKCldoZW4gcnVubmluZyBhIHRhc2sgdXNpbmcgcGFyYWxsZWxfZm9yX2VhY2gsIHdl IGdldCB0aGUgZm9sbG93aW5nCmRpc3RyaWJ1dGlvbjoKLi4uClBhcmFsbGVsIGZvcjogbl9l bGVtZW50czogNzI3MQpQYXJhbGxlbCBmb3I6IG1pbmltdW0gZWxlbWVudHMgcGVyIHRocmVh ZDogMTAKUGFyYWxsZWwgZm9yOiBlbHRzX3Blcl90aHJlYWQ6IDE4MTcKUGFyYWxsZWwgZm9y OiBlbGVtZW50cyBvbiB3b3JrZXIgdGhyZWFkIDAgICAgICAgOiAxODE3ClBhcmFsbGVsIGZv cjogZWxlbWVudHMgb24gd29ya2VyIHRocmVhZCAxICAgICAgIDogMTgxNwpQYXJhbGxlbCBm b3I6IGVsZW1lbnRzIG9uIHdvcmtlciB0aHJlYWQgMiAgICAgICA6IDE4MTcKUGFyYWxsZWwg Zm9yOiBlbGVtZW50cyBvbiB3b3JrZXIgdGhyZWFkIDMgICAgICAgOiAwClBhcmFsbGVsIGZv cjogZWxlbWVudHMgb24gbWFpbiB0aHJlYWQgICAgICAgICAgIDogMTgyMAouLi4KCk5vdGUg dGhhdCB0aGVyZSBhcmUgNCBhY3RpdmUgdGhyZWFkcywgYW5kIHNjaGVkdWxpbmcgZWx0c19w ZXJfdGhyZWFkIG9uIGVhY2gKb2YgdGhvc2UgaGFuZGxlcyA0ICogMTgxNyA9PSA3MjY4LCBs ZWF2aW5nIDMgImxlZnQgb3ZlciIgZWxlbWVudHMuCgpUaGVzZSBsZWZ0b3ZlcnMgYXJlIGN1 cnJlbnRseSBoYW5kbGVkIGluIHRoZSBtYWluIHRocmVhZC4KClRoYXQgZG9lc24ndCBzZWVt IHRvIG1hdHRlciBtdWNoIGZvciB0aGlzIGV4YW1wbGUsIGJ1dCBmb3Igc2F5IDEwIHRocmVh ZHMgYW5kCjk5IGVsZW1lbnRzLCB5b3UnZCBoYXZlIDkgdGhyZWFkcyBoYW5kbGluZyA5IGVs ZW1lbnRzIGFuZCAxIHRocmVhZCBoYW5kbGluZyAxOAplbGVtZW50cy4KCkluc3RlYWQsIGRp c3RyaWJ1dGUgdGhlIGxlZnQgb3ZlciBlbGVtZW50cyBvdmVyIHRoZSB3b3JrZXIgdGhyZWFk cywgc3VjaCB0aGF0CndlIGhhdmU6Ci4uLgpQYXJhbGxlbCBmb3I6IGVsZW1lbnRzIG9uIHdv cmtlciB0aHJlYWQgMCAgICAgICA6IDE4MTgKUGFyYWxsZWwgZm9yOiBlbGVtZW50cyBvbiB3 b3JrZXIgdGhyZWFkIDEgICAgICAgOiAxODE4ClBhcmFsbGVsIGZvcjogZWxlbWVudHMgb24g d29ya2VyIHRocmVhZCAyICAgICAgIDogMTgxOApQYXJhbGxlbCBmb3I6IGVsZW1lbnRzIG9u IHdvcmtlciB0aHJlYWQgMyAgICAgICA6IDAKUGFyYWxsZWwgZm9yOiBlbGVtZW50cyBvbiBt YWluIHRocmVhZCAgICAgICAgICAgOiAxODE3Ci4uLgoKVGVzdGVkIG9uIHg4Nl82NC1saW51 eC4KCi0tLQogZ2Ric3VwcG9ydC9wYXJhbGxlbC1mb3IuaCB8IDggKysrKysrKysKIDEgZmls ZSBjaGFuZ2VkLCA4IGluc2VydGlvbnMoKykKCmRpZmYgLS1naXQgYS9nZGJzdXBwb3J0L3Bh cmFsbGVsLWZvci5oIGIvZ2Ric3VwcG9ydC9wYXJhbGxlbC1mb3IuaAppbmRleCBjZmU4YTZl NGYwOS4uYmY0MGYxMjVmMGYgMTAwNjQ0Ci0tLSBhL2dkYnN1cHBvcnQvcGFyYWxsZWwtZm9y LmgKKysrIGIvZ2Ric3VwcG9ydC9wYXJhbGxlbC1mb3IuaApAQCAtMTQ3LDYgKzE0Nyw4IEBA IHBhcmFsbGVsX2Zvcl9lYWNoICh1bnNpZ25lZCBuLCBSYW5kb21JdCBmaXJzdCwgUmFuZG9t SXQgbGFzdCwKICAgc2l6ZV90IG5fdGhyZWFkcyA9IG5fd29ya2VyX3RocmVhZHM7CiAgIHNp emVfdCBuX2VsZW1lbnRzID0gbGFzdCAtIGZpcnN0OwogICBzaXplX3QgZWx0c19wZXJfdGhy ZWFkID0gMDsKKyAgc2l6ZV90IGVsdHNfbGVmdF9vdmVyID0gMDsKKwogICBpZiAobl90aHJl YWRzID4gMSkKICAgICB7CiAgICAgICAvKiBSZXF1aXJlIHRoYXQgdGhlcmUgc2hvdWxkIGJl IGF0IGxlYXN0IE4gZWxlbWVudHMgaW4gYQpAQCAtMTU1LDYgKzE1Nyw4IEBAIHBhcmFsbGVs X2Zvcl9lYWNoICh1bnNpZ25lZCBuLCBSYW5kb21JdCBmaXJzdCwgUmFuZG9tSXQgbGFzdCwK ICAgICAgIGlmIChuX2VsZW1lbnRzIC8gbl90aHJlYWRzIDwgbikKIAluX3RocmVhZHMgPSBz dGQ6Om1heCAobl9lbGVtZW50cyAvIG4sIChzaXplX3QpIDEpOwogICAgICAgZWx0c19wZXJf dGhyZWFkID0gbl9lbGVtZW50cyAvIG5fdGhyZWFkczsKKyAgICAgIGVsdHNfbGVmdF9vdmVy ID0gbl9lbGVtZW50cyAlIG5fdGhyZWFkczsKKyAgICAgIC8qIG5fZWxlbWVudHMgPT0gbl90 aHJlYWRzICogZWx0c19wZXJfdGhyZWFkICsgZWx0c19sZWZ0X292ZXIuICovCiAgICAgfQog CiAgIHNpemVfdCBjb3VudCA9IG5fdGhyZWFkcyA9PSAwID8gMCA6IG5fdGhyZWFkcyAtIDE7 CkBAIC0xNzAsNiArMTc0LDEwIEBAIHBhcmFsbGVsX2Zvcl9lYWNoICh1bnNpZ25lZCBuLCBS YW5kb21JdCBmaXJzdCwgUmFuZG9tSXQgbGFzdCwKICAgZm9yIChpbnQgaSA9IDA7IGkgPCBj b3VudDsgKytpKQogICAgIHsKICAgICAgIFJhbmRvbUl0IGVuZCA9IGZpcnN0ICsgZWx0c19w ZXJfdGhyZWFkOworICAgICAgaWYgKGkgPCBlbHRzX2xlZnRfb3ZlcikKKwkvKiBEaXN0cmli dXRlIHRoZSBsZWZ0b3ZlcnMgb3ZlciB0aGUgd29ya2VyIHRocmVhZHMsIHRvIGF2b2lkIGhh dmluZworCSAgIHRvIGhhbmRsZSBhbGwgb2YgdGhlbSBpbiBhIHNpbmdsZSB0aHJlYWQuICAq LworCWVuZCsrOwogICAgICAgaWYgKHBhcmFsbGVsX2Zvcl9lYWNoX2RlYnVnKQogCWRlYnVn X3ByaW50ZiAoXygiUGFyYWxsZWwgZm9yOiBlbGVtZW50cyBvbiB3b3JrZXIgdGhyZWFkICVp XHQ6ICV6dVxuIiksCiAJCSAgICAgIGksIChzaXplX3QpKGVuZCAtIGZpcnN0KSk7Cg== --------------PkOyhdId99q0ph6iZa1aMryn--