public inbox for libc-alpha@sourceware.org
 help / color / mirror / Atom feed
* Review of the qsort situation
@ 2023-12-04 13:08 Florian Weimer
  2023-12-04 14:32 ` Adhemerval Zanella Netto
  2023-12-05 11:43 ` Sam James
  0 siblings, 2 replies; 4+ messages in thread
From: Florian Weimer @ 2023-12-04 13:08 UTC (permalink / raw)
  To: libc-alpha

Last week, I promised to look at the qsort situation.  I see two major
issues.

(a) Compatibility impact of the new implementation

We saw two different kinds of breakage in LLVM, a hang in 389-ds-base
(the LDAP server), and a test suite failure in notmuch.  I'm pretty sure
this is just the surface, and there is much breakage we don't know about
yet.  Particularly in proprietary software.

So we'd probably have to retain the old implementation and add a symbol
version.  Not great, but manageable.

(b) Performance of the new information

I did not know this previously, but mergesort is rather close to the
theoretical minimum in terms of comparisons.  Quicksort uses something
39% more comparisons on average.

I believe minimizing the number of comparisons is a reasonable goal
given how qsort is typically used: the indirect call has a certain
overhead, and sorting variable-length strings is not uncommon, either.

As an example, I looked at the stringtable sorting in ldconfig.  On my
system, there are 4,049 strings to sort for tail merging.  Because it's
targeting tail merging, this sort does not use strcmp, but a reverse
string comparison.  (It would probably be faster to reverse the strings,
sort, and reverse the strings again.)

The number of comparisons and the median run-times (for the fixed input
table) are

  master:                         57,384   0.657ms
  master with qsort from 2.38:    43,426   0.543ms
  master heapsort only:           84,507   0.921ms
    + Kuan-Wei Chiu patch:        49,917   0.858ms

The numbers fluctuate quite a bit, I assume the branch predictor is
struggling.  They contradict somewhat that minimizing comparisons is
important.

I can try to gather more performance numbers, but I don't think this
looks good.  It's slower, and there are far-ranging compatibility
issues.  So I don't think this is the right direction.  If quicksort was
actually faster than mergesort, it might be a different discussion.

Thanks,
Florian


^ permalink raw reply	[flat|nested] 4+ messages in thread

end of thread, other threads:[~2023-12-05 11:44 UTC | newest]

Thread overview: 4+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2023-12-04 13:08 Review of the qsort situation Florian Weimer
2023-12-04 14:32 ` Adhemerval Zanella Netto
2023-12-05 10:40   ` Florian Weimer
2023-12-05 11:43 ` Sam James

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).