public inbox for libc-alpha@sourceware.org
 help / color / mirror / Atom feed
* Review of the qsort situation
@ 2023-12-04 13:08 Florian Weimer
  2023-12-04 14:32 ` Adhemerval Zanella Netto
  2023-12-05 11:43 ` Sam James
  0 siblings, 2 replies; 4+ messages in thread
From: Florian Weimer @ 2023-12-04 13:08 UTC (permalink / raw)
  To: libc-alpha

Last week, I promised to look at the qsort situation.  I see two major
issues.

(a) Compatibility impact of the new implementation

We saw two different kinds of breakage in LLVM, a hang in 389-ds-base
(the LDAP server), and a test suite failure in notmuch.  I'm pretty sure
this is just the surface, and there is much breakage we don't know about
yet.  Particularly in proprietary software.

So we'd probably have to retain the old implementation and add a symbol
version.  Not great, but manageable.

(b) Performance of the new information

I did not know this previously, but mergesort is rather close to the
theoretical minimum in terms of comparisons.  Quicksort uses something
39% more comparisons on average.

I believe minimizing the number of comparisons is a reasonable goal
given how qsort is typically used: the indirect call has a certain
overhead, and sorting variable-length strings is not uncommon, either.

As an example, I looked at the stringtable sorting in ldconfig.  On my
system, there are 4,049 strings to sort for tail merging.  Because it's
targeting tail merging, this sort does not use strcmp, but a reverse
string comparison.  (It would probably be faster to reverse the strings,
sort, and reverse the strings again.)

The number of comparisons and the median run-times (for the fixed input
table) are

  master:                         57,384   0.657ms
  master with qsort from 2.38:    43,426   0.543ms
  master heapsort only:           84,507   0.921ms
    + Kuan-Wei Chiu patch:        49,917   0.858ms

The numbers fluctuate quite a bit, I assume the branch predictor is
struggling.  They contradict somewhat that minimizing comparisons is
important.

I can try to gather more performance numbers, but I don't think this
looks good.  It's slower, and there are far-ranging compatibility
issues.  So I don't think this is the right direction.  If quicksort was
actually faster than mergesort, it might be a different discussion.

Thanks,
Florian


^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: Review of the qsort situation
  2023-12-04 13:08 Review of the qsort situation Florian Weimer
@ 2023-12-04 14:32 ` Adhemerval Zanella Netto
  2023-12-05 10:40   ` Florian Weimer
  2023-12-05 11:43 ` Sam James
  1 sibling, 1 reply; 4+ messages in thread
From: Adhemerval Zanella Netto @ 2023-12-04 14:32 UTC (permalink / raw)
  To: Florian Weimer, libc-alpha



On 04/12/23 10:08, Florian Weimer wrote:
> Last week, I promised to look at the qsort situation.  I see two major
> issues.
> 
> (a) Compatibility impact of the new implementation
> 
> We saw two different kinds of breakage in LLVM, a hang in 389-ds-base
> (the LDAP server), and a test suite failure in notmuch.  I'm pretty sure
> this is just the surface, and there is much breakage we don't know about
> yet.  Particularly in proprietary software.
> 
> So we'd probably have to retain the old implementation and add a symbol
> version.  Not great, but manageable.

Are still compatibility issues besides the sort callback being called
with same pointer (which should be fixed yet)?

I tried to model the introsort based on libstdc++ std::sort, so it is
kinda surprising that we are seeing extra compatibilities issues.

In any case, this still a latent issues that was not triggered before
because the quicksort was not taken in most cases.  On a memory pressure
situation, the quicksort will be used and we caller will face the same
compatibility issues.

> 
> (b) Performance of the new information
> 
> I did not know this previously, but mergesort is rather close to the
> theoretical minimum in terms of comparisons.  Quicksort uses something
> 39% more comparisons on average.
> 
> I believe minimizing the number of comparisons is a reasonable goal
> given how qsort is typically used: the indirect call has a certain
> overhead, and sorting variable-length strings is not uncommon, either.
> 
> As an example, I looked at the stringtable sorting in ldconfig.  On my
> system, there are 4,049 strings to sort for tail merging.  Because it's
> targeting tail merging, this sort does not use strcmp, but a reverse
> string comparison.  (It would probably be faster to reverse the strings,
> sort, and reverse the strings again.)
> 
> The number of comparisons and the median run-times (for the fixed input
> table) are
> 
>   master:                         57,384   0.657ms
>   master with qsort from 2.38:    43,426   0.543ms
>   master heapsort only:           84,507   0.921ms
>     + Kuan-Wei Chiu patch:        49,917   0.858ms
> 
> The numbers fluctuate quite a bit, I assume the branch predictor is
> struggling.  They contradict somewhat that minimizing comparisons is
> important.
> 
> I can try to gather more performance numbers, but I don't think this
> looks good.  It's slower, and there are far-ranging compatibility
> issues.  So I don't think this is the right direction.  If quicksort was
> actually faster than mergesort, it might be a different discussion.

Yes, the quicksort performance is usually worse than mergesort and I did
advertise it on cover letter.  The change was more motivated to remove
the malloc usage, so the function is fully AS-signal-safe and to simplify
corner case where memory pressure might trigger quicksort and its O(n^2)
worse case.

So we can add back a compatibility symbol or even just revert this change,
but it does not fully solve the corner case issue on what to do if malloc
fails. Should we abort, like gcc_sort does on gcc? Should we fallback to
heapsort (it might the best option in this case)?

POSIX does not really define qsort to be sync-signal-safe, although all
other implementations I am aware of makes it (usually by using quicksort
with O(1) space requirement).

^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: Review of the qsort situation
  2023-12-04 14:32 ` Adhemerval Zanella Netto
@ 2023-12-05 10:40   ` Florian Weimer
  0 siblings, 0 replies; 4+ messages in thread
From: Florian Weimer @ 2023-12-05 10:40 UTC (permalink / raw)
  To: Adhemerval Zanella Netto; +Cc: libc-alpha

* Adhemerval Zanella Netto:

> On 04/12/23 10:08, Florian Weimer wrote:
>> Last week, I promised to look at the qsort situation.  I see two major
>> issues.
>> 
>> (a) Compatibility impact of the new implementation
>> 
>> We saw two different kinds of breakage in LLVM, a hang in 389-ds-base
>> (the LDAP server), and a test suite failure in notmuch.  I'm pretty sure
>> this is just the surface, and there is much breakage we don't know about
>> yet.  Particularly in proprietary software.
>> 
>> So we'd probably have to retain the old implementation and add a symbol
>> version.  Not great, but manageable.
>
> Are still compatibility issues besides the sort callback being called
> with same pointer (which should be fixed yet)?

Lack of stability is the major issue, and that's also not going to be
fixable.  But if qsort were actually faster, that might be a compelling
reason to deal with the compatibility fallout.

> In any case, this still a latent issues that was not triggered before
> because the quicksort was not taken in most cases.  On a memory pressure
> situation, the quicksort will be used and we caller will face the same
> compatibility issues.

Memory pressure is not relevant to short arrays.  The issues we saw
obviously were with short arrays.

>> I can try to gather more performance numbers, but I don't think this
>> looks good.  It's slower, and there are far-ranging compatibility
>> issues.  So I don't think this is the right direction.  If quicksort was
>> actually faster than mergesort, it might be a different discussion.
>
> Yes, the quicksort performance is usually worse than mergesort and I did
> advertise it on cover letter.

But that's really counter-intuitive.  Shouldn't quicksort be faster?

What if we specialize the implementation for pointer-sized values, so
that swapping does not require indirect calls or conditional branches?

> So we can add back a compatibility symbol or even just revert this change,
> but it does not fully solve the corner case issue on what to do if malloc
> fails. Should we abort, like gcc_sort does on gcc? Should we fallback to
> heapsort (it might the best option in this case)?

That doesn't seem unreasonable—it has a fairly compact implementation.

Thanks,
Florian


^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: Review of the qsort situation
  2023-12-04 13:08 Review of the qsort situation Florian Weimer
  2023-12-04 14:32 ` Adhemerval Zanella Netto
@ 2023-12-05 11:43 ` Sam James
  1 sibling, 0 replies; 4+ messages in thread
From: Sam James @ 2023-12-05 11:43 UTC (permalink / raw)
  To: Florian Weimer; +Cc: libc-alpha


Florian Weimer <fweimer@redhat.com> writes:

> Last week, I promised to look at the qsort situation.  I see two major
> issues.
>
> (a) Compatibility impact of the new implementation
>
> We saw two different kinds of breakage in LLVM, a hang in 389-ds-base
> (the LDAP server), and a test suite failure in notmuch.  I'm pretty sure
> this is just the surface, and there is much breakage we don't know about
> yet.  Particularly in proprietary software.

Yes, these reports are from people testing master early, and
historically not _that_ many people do that, so I was surprised some of
that got noticed early (the reports weren't just from e.g. fedora or
anything).

Nobody on our end is testing it yet, for example.

Anyway, I at least don't have the appetite/energy to try chase down
these at the moment given occupied with the other porting..

>
> So we'd probably have to retain the old implementation and add a symbol
> version.  Not great, but manageable.
>
> [...]

thanks,
sam

^ permalink raw reply	[flat|nested] 4+ messages in thread

end of thread, other threads:[~2023-12-05 11:44 UTC | newest]

Thread overview: 4+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2023-12-04 13:08 Review of the qsort situation Florian Weimer
2023-12-04 14:32 ` Adhemerval Zanella Netto
2023-12-05 10:40   ` Florian Weimer
2023-12-05 11:43 ` Sam James

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).