Re: [PATCH] Implement introsort for qsort

From: brian.inglis@systematicsw.ab.ca
To: newlib@sourceware.org
Subject: Re: [PATCH] Implement introsort for qsort
Date: Sun, 14 Jan 2024 11:04:55 -0700
Message-ID: <ed43cde7-978e-481e-8f10-8f776ee1152e@systematicsw.ab.ca>
In-Reply-To: <ZaLHmpsQK3mSPo0x@visitorckw-System-Product-Name>

On 2024-01-13 10:25, Kuan-Wei Chiu wrote:
> On Sun, Dec 31, 2023 at 03:59:40AM +0800, Kuan-Wei Chiu wrote:
>> Enhances the qsort implementation by introducing introsort to address
>> the worst-case time complexity of O(n^2) associated with quicksort.
>> Introsort is utilized to switch to heapsort when quicksort recursion
>> depth becomes excessive, ensuring a worst-case time complexity of
>> O(n log n).
>>
>> The heapsort implementation adopts a bottom-up approach, significantly
>> reducing the number of required comparisons and enhancing overall
>> efficiency.
>>
>> Refs:
>>    Introspective Sorting and Selection Algorithms
>>    David R. Musser
>>    Software—Practice & Experience, 27(8); Pages 983–993, Aug 1997
>>    https://dl.acm.org/doi/10.5555/261387.261395
>>
>>    A killer adversary for quicksort
>>    M. D. McIlroy
>>    Software—Practice & Experience, 29(4); Pages 341–344, 10 April 1999
>>    https://dl.acm.org/doi/10.5555/311868.311871
>>
>>    BOTTOM-UP-HEAPSORT, a new variant of HEAPSORT beating, on an average,
>>    QUICKSORT (if n is not very small)
>>    Ingo Wegener
>>    Theoretical Computer Science, 118(1); Pages 81-98, 13 September 1993
>>    https://dl.acm.org/doi/10.5555/162625.162643
>>
>> Signed-off-by: Kuan-Wei Chiu <visitorckw@gmail.com>
>> ---
>> To assess the performance of the new introsort and the old quicksort in
>> worst-case scenarios, we examined the number of comparisons required
>> for sorting based on the paper "A killer adversary for quicksort."
>>
>> Before the patch:
>> n = 1000,  cmp_count = 100853
>> n = 2000,  cmp_count = 395991
>> n = 3000,  cmp_count = 885617
>> n = 4000,  cmp_count = 1569692
>> n = 5000,  cmp_count = 2448183
>> n = 6000,  cmp_count = 3521140
>> n = 7000,  cmp_count = 4788493
>> n = 8000,  cmp_count = 6250342
>> n = 9000,  cmp_count = 7906610
>> n = 10000,  cmp_count = 9757353
>>
>> After the patch:
>> n = 1000,  cmp_count = 27094
>> n = 2000,  cmp_count = 61500
>> n = 3000,  cmp_count = 100529
>> n = 4000,  cmp_count = 136538
>> n = 5000,  cmp_count = 182698
>> n = 6000,  cmp_count = 221120
>> n = 7000,  cmp_count = 259933
>> n = 8000,  cmp_count = 299058
>> n = 9000,  cmp_count = 356405
>> n = 10000,  cmp_count = 397723
>>
>>   newlib/libc/search/qsort.c | 153 +++++++++++++++++++++++++++++++++++--
>>   1 file changed, 147 insertions(+), 6 deletions(-)
> 
> Hi Brian,
> 
> Thank you for your feedback, and I apologize for the late response.
> 
> I found your reply in the archives of the newlib mailing list on the
> web interface. However, for some reason, I did not find the same email
> in my email client.
> 
> Upon reviewing the previously mentioned issue, it seems that although
> both instances lead to a worst-case degradation to O(n^2), the causes
> appear to be different. One may be the adoption of insertion sort at an
> inappropriate time, causing degradation, while the other could be
> adversarial input leading to the selection of an unsuitable pivot. I
> can delve deeper into this issue later, but I believe it should be
> different patches.
> 
> I looked at the code at BSD implementation [1], but I am skeptical that
> it would address the problem of adversarial input degrading to O(n^2).
> I will conduct experiments in subsequent emails to verify this.
> 
> Thank you for reminding me to consider not only the number of
> comparisons but also the number of swaps. Indeed, switching to heapsort
> may likely increase the required number of swaps, but while the number
> of swaps would be at an O(n log n) level, the number of comparisons
> would be at an O(n^2) level. Therefore, a trade-off needs to be made,
> and I will conduct further experiments. If increasing the number of
> swaps does lead to degradation in efficiency, it would be better to
> drop this patch.

Depending on your target ISA, comparisons can take almost no time with 
conditional instructions, and there are some tricks for unrolling and 
hard-wiring code to deal with small sections of arrays.
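
One possible reading of "hard-wiring" is a fixed sorting network for very 
small subarrays. The sketch below is only an illustration under that 
assumption; the names cswap, sort3 and sort4 are hypothetical and do not come 
from newlib or the patch. Whether the ternaries compile to conditional 
instructions depends on the compiler and target.

```c
#include <assert.h>
#include <stddef.h>

/* Compare-exchange: leaves the smaller value in *a, the larger in *b.
   Written without an if/else around the stores so that a compiler MAY
   use conditional moves/selects; this is not guaranteed. */
static inline void cswap(int *a, int *b)
{
    int x = *a, y = *b;
    int lo = x < y ? x : y;
    int hi = x < y ? y : x;
    *a = lo;
    *b = hi;
}

/* Unrolled 3-element sorting network: (0,1) (1,2) (0,1). */
static inline void sort3(int *v)
{
    assert(v != NULL);
    cswap(&v[0], &v[1]);
    cswap(&v[1], &v[2]);
    cswap(&v[0], &v[1]);
}

/* Unrolled 4-element sorting network: (0,1) (2,3) (0,2) (1,3) (1,2). */
static inline void sort4(int *v)
{
    assert(v != NULL);
    cswap(&v[0], &v[1]);
    cswap(&v[2], &v[3]);
    cswap(&v[0], &v[2]);
    cswap(&v[1], &v[3]);
    cswap(&v[1], &v[2]);
}
```

A generic qsort cannot use this directly, since it only sees element size and 
a comparison callback; the sketch shows the idea for a known element type.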

> Regarding the experimental data, as I am not a native English speaker,
> I would like to inquire about the meaning of "two values out of order
> repeated to fill the array, in alternating rows and alternate halves."

For all data types in your tests (or at least strings), try test sequences like:

	A	B	A	B
	B	A	A	B
	A	B	...	...
	B	A	B	A
	...	...	B	A
			...	...

to fill your test arrays with data that robust sort algorithms should handle 
without degradation.
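
Reading the table above column by column gives four layouts: A B A B ..., 
B A B A ..., a first half of A followed by a second half of B, and the 
reverse. A minimal sketch of generating them and counting comparisons, 
assuming int elements and that interpretation of the table; the names 
ab_pattern, fill_ab and count_cmps are hypothetical:

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical names for the four column layouts. */
enum ab_pattern { ALT_AB, ALT_BA, HALF_AB, HALF_BA };

static unsigned long cmp_count;   /* comparisons made by the last sort */

/* Comparator that counts how often qsort calls it. */
static int cmp_int(const void *pa, const void *pb)
{
    int a = *(const int *)pa, b = *(const int *)pb;
    cmp_count++;
    return (a > b) - (a < b);
}

/* Fill v[0..n) with the two values a and b laid out as pattern p. */
static void fill_ab(int *v, size_t n, enum ab_pattern p, int a, int b)
{
    for (size_t i = 0; i < n; i++) {
        int use_a;
        switch (p) {
        case ALT_AB:  use_a = (i % 2 == 0); break;  /* A B A B ... */
        case ALT_BA:  use_a = (i % 2 == 1); break;  /* B A B A ... */
        case HALF_AB: use_a = (i < n / 2);  break;  /* A A ... B B */
        default:      use_a = (i >= n / 2); break;  /* B B ... A A */
        }
        v[i] = use_a ? a : b;
    }
}

/* Sort one pattern with the libc qsort; return the comparison count,
   or -1 if the result is not in order. */
static long count_cmps(size_t n, enum ab_pattern p)
{
    if (n == 0)
        return 0;
    int *v = malloc(n * sizeof *v);
    assert(v != NULL);
    fill_ab(v, n, p, 2, 1);   /* A = 2 > B = 1, so "A B" is out of order */
    cmp_count = 0;
    qsort(v, n, sizeof *v, cmp_int);
    long result = (long)cmp_count;
    for (size_t i = 1; i < n; i++)
        if (v[i - 1] > v[i])
            result = -1;
    free(v);
    return result;
}
```

Calling count_cmps(n, p) for each pattern over a range of n, and comparing the 
counts against n log2(n), would show whether a given qsort degrades on these 
inputs. The same fill routine could be adapted to strings or other types.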

A well engineered quicksort should be able to handle these cases in memory 
without degradation, while heapsort is more useful for managing runs of sorted 
sequences from I/O buffers when data spills from memory.

The literature in (CACM) papers and books by Knuth (TAOCP vol. 3, Sorting and 
Searching) and Sedgewick (Knuth's pupil and successor, whose thesis was on 
quicksort, and who wrote Algorithms) covers the data and algorithmic 
considerations, while implementations and tweaks in papers by McIlroy (some 
with Bentley and Knuth) and articles and books by Bentley cover the pragmatics.

The series of articles below discusses the many problems in the BSD-derived 
(faulty) qsort used by newlib/nano/pico and others, with analyses and numbers:

	https://www.raygard.net/2022/01/17/Re-engineering-a-qsort-part-1/

showing newlib worse than ideal by 35% to 1000%!
Please also note the comments about iterating tail calls with gotos.
The articles also mention sort test data sources and test benches, and seem to 
consider 10k reps of 5 data values as representative tests.
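
For readers unfamiliar with the tail-call point, the usual technique is to 
recurse only into the smaller partition and loop (here with a goto) on the 
larger one, which bounds the stack depth at O(log n) even when partitioning 
is poor. The following is a minimal sketch of that idea on int arrays with a 
middle-element pivot; it is not the qs22j or newlib code, and the names 
qsort_sketch and qsort_sketch_range are hypothetical. It does nothing to 
improve the O(n^2) worst-case comparison count.

```c
#include <assert.h>
#include <stddef.h>

static void swap_int(int *a, int *b) { int t = *a; *a = *b; *b = t; }

/* Sort v[lo..hi] inclusive. Recurses on the smaller side only and
   jumps back to the top for the larger side instead of making a
   second (tail) call. */
static void qsort_sketch_range(int *v, ptrdiff_t lo, ptrdiff_t hi)
{
top:
    if (lo >= hi)
        return;

    int pivot = v[lo + (hi - lo) / 2];
    ptrdiff_t i = lo, j = hi;

    /* Hoare-style partition: afterwards v[lo..j] <= pivot <= v[i..hi]. */
    while (i <= j) {
        while (v[i] < pivot) i++;
        while (v[j] > pivot) j--;
        if (i <= j) {
            swap_int(&v[i], &v[j]);
            i++;
            j--;
        }
    }

    if (j - lo < hi - i) {
        qsort_sketch_range(v, lo, j);   /* smaller left part: recurse */
        lo = i;                         /* larger right part: iterate */
    } else {
        qsort_sketch_range(v, i, hi);   /* smaller right part: recurse */
        hi = j;                         /* larger left part: iterate */
    }
    goto top;
}

static void qsort_sketch(int *v, size_t n)
{
    assert(v != NULL || n == 0);
    if (n > 1)
        qsort_sketch_range(v, 0, (ptrdiff_t)n - 1);
}
```

Because each recursive call handles at most half of the current range, the 
recursion depth stays logarithmic regardless of the pivot choice.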

The related code qs22j fixes all the known problems, is licensed 0BSD, and could 
replace the newlib/nano/pico libc qsort:

	https://github.com/raygard/qsort_dev

> This clarification will help me avoid any misunderstanding of the text.
> I will post new experimental data within the next few days.
> Once again, thank you for your feedback and suggestions!
> Link: https://cgit.freebsd.org/src/tree/lib/libc/stdlib/qsort.c [1]

-- 
Take care. Thanks, Brian Inglis              Calgary, Alberta, Canada

La perfection est atteinte                   Perfection is achieved
non pas lorsqu'il n'y a plus rien à ajouter  not when there is no more to add
mais lorsqu'il n'y a plus rien à retirer     but when there is no more to cut
                                 -- Antoine de Saint-Exupéry
