From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from relay.hostedemail.com (smtprelay0017.hostedemail.com [216.40.44.17]) by sourceware.org (Postfix) with ESMTPS id 531793858D20 for ; Sun, 14 Jan 2024 18:04:59 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.2 sourceware.org 531793858D20 Authentication-Results: sourceware.org; dmarc=none (p=none dis=none) header.from=systematicsw.ab.ca Authentication-Results: sourceware.org; spf=pass smtp.mailfrom=systematicsw.ab.ca ARC-Filter: OpenARC Filter v1.0.0 sourceware.org 531793858D20 Authentication-Results: server2.sourceware.org; arc=none smtp.remote-ip=216.40.44.17 ARC-Seal: i=1; a=rsa-sha256; d=sourceware.org; s=key; t=1705255501; cv=none; b=Rl3AodUKbfka5udK8LzFrGa9VkxwfynhzUZpdVUI/MWV699Cxtc7z/QcqeD2Ci+Q4KIXlzIUGWyNqykzOSx9cdLzb1Gaq3RKjvxKzhZhHQ2hgCt8xy/Re8tY5xuv5zNrz76M0WfxqTsGiNrcxlgpHyS+G1ucyX6oKH9HoDMYIXI= ARC-Message-Signature: i=1; a=rsa-sha256; d=sourceware.org; s=key; t=1705255501; c=relaxed/simple; bh=PqyeggOXdQwlWpqsGQywR4u1zQjjhuucQ379/0h33Wk=; h=Message-ID:Date:MIME-Version:From:Subject:To; b=X4SnLmATtuyisCuOttyff71i9PC3ZzEwPJuxdXzfCTPAcoqIDHTDFGHK4K65UDBOsx02W0fnFMVcFGiBZlZRcrJhGU0TTf5etFrFmhq11yPOx1N5xcJtCy/TcOQR8KQyjZKwq/JqihiG24QmHmLtJKuiqljlzIqAiJMCo+jn/H0= ARC-Authentication-Results: i=1; server2.sourceware.org Received: from omf06.hostedemail.com (a10.router.float.18 [10.200.18.1]) by unirelay07.hostedemail.com (Postfix) with ESMTP id 7B3C51603C0 for ; Sun, 14 Jan 2024 18:04:58 +0000 (UTC) Received: from [HIDDEN] (Authenticated sender: Brian.Inglis@SystematicSW.ab.ca) by omf06.hostedemail.com (Postfix) with ESMTPA id 09DAA2000F for ; Sun, 14 Jan 2024 18:04:55 +0000 (UTC) Message-ID: Date: Sun, 14 Jan 2024 11:04:55 -0700 MIME-Version: 1.0 User-Agent: Mozilla Thunderbird From: brian.inglis@systematicsw.ab.ca Reply-To: newlib@sourceware.org Subject: Re: [PATCH] Implement introsort for qsort Content-Language: en-CA To: newlib@sourceware.org References: <20231230195940.1989559-1-visitorckw@gmail.com> Autocrypt: addr=brian.inglis@systematicsw.ab.ca; keydata= xjMEXopx8xYJKwYBBAHaRw8BAQdAnCK0qv/xwUCCZQoA9BHRYpstERrspfT0NkUWQVuoePbN LkJyaWFuIEluZ2xpcyA8QnJpYW4uSW5nbGlzQFN5c3RlbWF0aWNTdy5hYi5jYT7ClgQTFggA PhYhBMM5/lbU970GBS2bZB62lxu92I8YBQJeinHzAhsDBQkJZgGABQsJCAcCBhUKCQgLAgQW AgMBAh4BAheAAAoJEB62lxu92I8Y0ioBAI8xrggNxziAVmr+Xm6nnyjoujMqWcq3oEhlYGAO WacZAQDFtdDx2koSVSoOmfaOyRTbIWSf9/Cjai29060fsmdsDM44BF6KcfMSCisGAQQBl1UB BQEBB0Awv8kHI2PaEgViDqzbnoe8B9KMHoBZLS92HdC7ZPh8HQMBCAfCfgQYFggAJhYhBMM5 /lbU970GBS2bZB62lxu92I8YBQJeinHzAhsMBQkJZgGAAAoJEB62lxu92I8YZwUBAJw/74rF IyaSsGI7ewCdCy88Lce/kdwX7zGwid+f8NZ3AQC/ezTFFi5obXnyMxZJN464nPXiggtT9gN5 RSyTY8X+AQ== Organization: Systematic Software In-Reply-To: Content-Type: text/plain; charset=UTF-8; format=flowed Content-Transfer-Encoding: 8bit X-Rspamd-Queue-Id: 09DAA2000F X-Spam-Status: No, score=-2.3 required=5.0 tests=BAYES_00,KAM_DMARC_STATUS,KAM_SHORT,RCVD_IN_DNSWL_NONE,RCVD_IN_MSPIKE_H3,RCVD_IN_MSPIKE_WL,SPF_HELO_PASS,SPF_PASS,TXREP,T_SCC_BODY_TEXT_LINE,UNPARSEABLE_RELAY,URIBL_SBL_A autolearn=no autolearn_force=no version=3.4.6 X-Stat-Signature: rjundtyka3e5kbkb9gb7w7ro58g8by8k X-Rspamd-Server: rspamout04 X-Session-Marker: 427269616E2E496E676C69734053797374656D6174696353572E61622E6361 X-Session-ID: U2FsdGVkX18vUq5CC2KNjnrG1LUuLeCrgGZoDsGRLnk= X-HE-Tag: 1705255495-578507 X-HE-Meta: U2FsdGVkX1+DiL09MSXHlbsKiJd0ooPhr7sCd2+GRPEAgtasY7sNOWEtBgqQbNnW+1rCVQFiW1IN1ZmNh3mhNf+uH0H5zJJ3fItXnd1V1kTnwHWK1PzH7MO/seZcN5mms1LiuaqN8G0GV2Lw7n75Z92bsKIT6DCtxRTwzP/g4rmDVhYb4JYUnSC2mJVmmNDcF3t2ct9+XKiSg5fjo/CjkWVcKwu1J8tsXjW27YUyvBKYapwmbx4Y207eCByFmgNbhwFKOjELFPf4plkta1N7tX+ubZzhEjqslZOZIxZOFIJbVw82FnfoFkQKZmggNHhfGPDSaxxAvSo+zShZnzVxPAuov6mpzgIiPmLyO/aHe60oTT5Gzjnc8NvJkkaEOAtIg3+L28L57f21ctaZ0EIYlvEoNksZw8D/69V9gpnt3ggjUayxKv4pPf2keEqxw6Mo+P25KwTBfpN1jHglPG7HafFqqqWmSbGegwxPMtwNeEOA0NlMvU5RegYP7L4GYLVVcx3IMZ7NHwlylcCS28peLCX66hwLvspZqun2RHYhpg9iFHsJi7Nb6mbLNDYX2VmO4FecWLaLK/dOG6gLoiPVMg== X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org List-Id: On 2024-01-13 10:25, Kuan-Wei Chiu wrote: > On Sun, Dec 31, 2023 at 03:59:40AM +0800, Kuan-Wei Chiu wrote: >> Enhances the qsort implementation by introducing introsort to address >> the worst-case time complexity of O(n^2) associated with quicksort. >> Introsort is utilized to switch to heapsort when quicksort recursion >> depth becomes excessive, ensuring a worst-case time complexity of >> O(n log n). >> >> The heapsort implementation adopts a bottom-up approach, significantly >> reducing the number of required comparisons and enhancing overall >> efficiency. >> >> Refs: >> Introspective Sorting and Selection Algorithms >> David R. Musser >> Software—Practice & Experience, 27(8); Pages 983–993, Aug 1997 >> https://dl.acm.org/doi/10.5555/261387.261395 >> >> A killer adversary for quicksort >> M. D. McIlroy >> Software—Practice & Experience, 29(4); Pages 341–344, 10 April 1999 >> https://dl.acm.org/doi/10.5555/311868.311871 >> >> BOTTOM-UP-HEAPSORT, a new variant of HEAPSORT beating, on an average, >> QUICKSORT (if n is not very small) >> Ingo Wegener >> Theoretical Computer Science, 118(1); Pages 81-98, 13 September 1993 >> https://dl.acm.org/doi/10.5555/162625.162643 >> >> Signed-off-by: Kuan-Wei Chiu >> --- >> To assess the performance of the new introsort and the old quicksort in >> worst-case scenarios, we examined the number of comparisons required >> for sorting based on the paper "A killer adversary for quicksort." >> >> Before the patch: >> n = 1000, cmp_count = 100853 >> n = 2000, cmp_count = 395991 >> n = 3000, cmp_count = 885617 >> n = 4000, cmp_count = 1569692 >> n = 5000, cmp_count = 2448183 >> n = 6000, cmp_count = 3521140 >> n = 7000, cmp_count = 4788493 >> n = 8000, cmp_count = 6250342 >> n = 9000, cmp_count = 7906610 >> n = 10000, cmp_count = 9757353 >> >> After the patch: >> n = 1000, cmp_count = 27094 >> n = 2000, cmp_count = 61500 >> n = 3000, cmp_count = 100529 >> n = 4000, cmp_count = 136538 >> n = 5000, cmp_count = 182698 >> n = 6000, cmp_count = 221120 >> n = 7000, cmp_count = 259933 >> n = 8000, cmp_count = 299058 >> n = 9000, cmp_count = 356405 >> n = 10000, cmp_count = 397723 >> >> newlib/libc/search/qsort.c | 153 +++++++++++++++++++++++++++++++++++-- >> 1 file changed, 147 insertions(+), 6 deletions(-) > > Hi Brian, > > Thank you for your feedback, and I apologize for the late response. > > I found your reply in the archives of the newlib mailing list on the > web interface. However, for some reason, I did not find the same email > in my email client. > > Upon reviewing the previously mentioned issue, it seems that although > both instances lead to a worst-case degradation to O(n^2), the causes > appear to be different. One may be the adoption of insertion sort at an > inappropriate time, causing degradation, while the other could be > adversarial input leading to the selection of an unsuitable pivot. I > can delve deeper into this issue later, but I believe it should be > different patches. > > I looked at the code at BSD implementation [1], but I am skeptical that > it would address the problem of adversarial input degrading to O(n^2). > I will conduct experiments in subsequent emails to verify this. > > Thank you for reminding me to consider not only the number of > comparisons but also the number of swaps. Indeed, switching to heapsort > may likely increase the required number of swaps, but while the number > of swaps would be at an O(n log n) level, the number of comparisons > would be at an O(n^2) level. Therefore, a trade-off needs to be made, > and I will conduct further experiments. If increasing the number of > swaps does lead to degradation in efficiency, it would be better to > drop this patch. Depending on your target ISA, comparisons can take almost no time with conditional instructions, and there are some tricks for unrolling and hard wiring code to deal with sections of arrays. > Regarding the experimental data, as I am not a native English speaker, > I would like to inquire about the meaning of "two values out of order > repeated to fill the array, in alternating rows and alternate halves." For all data types in your tests (or at least strings), try test sequences like: A B A B B A A B A B ... ... B A B A ... ... B A ... ... to fill your test arrays with data that robust sort algorithms should handle without degradation. A well engineered quicksort should be able to handle these cases in memory without degradation, while heapsort is more useful for managing runs of sorted sequences from I/O buffers when data spills from memory. The literature in (CACM) papers and books by Knuth (tAoCP 3 Sorting...) and Sedgewick (Knuth's pupil and successor whose thesis was quicksort! and wsote Algorithms) covers the data and algorithmic considerations, while implementations and tweaks in papers by McIlroy (some with Bentley and Knuth) and articles and books by Bentley cover the pragmatics. The series of articles below discusses the many problems in the BSD derived (faulty) qsort used by newlib/nano/pico and others, with analyses and numbers: https://www.raygard.net/2022/01/17/Re-engineering-a-qsort-part-1/ showing newlib worse than ideal by 35% to 1000%! Please also note the comments about iterating tail calls with gotos. The articles also mention sort test data sources and test benches, and seems to consider 10k reps of 5 data values as representative tests. The related code qs22j fixes all the known problems, is licensed 0BSD, and could replace the newlib/nano/pico libc qsort: https://github.com/raygard/qsort_dev > This clarification will help me avoid any misunderstanding of the text. > I will post new experimental data within the next few days. > Once again, thank you for your feedback and suggestions! > Link: https://cgit.freebsd.org/src/tree/lib/libc/stdlib/qsort.c [1] -- Take care. Thanks, Brian Inglis Calgary, Alberta, Canada La perfection est atteinte Perfection is achieved non pas lorsqu'il n'y a plus rien à ajouter not when there is no more to add mais lorsqu'il n'y a plus rien à retirer but when there is no more to cut -- Antoine de Saint-Exupéry