* [PATCH] stdlib: Optimize number of calls to comparison function @ 2023-12-02 21:48 Kuan-Wei Chiu 2023-12-04 8:20 ` Florian Weimer 2024-02-16 7:08 ` [PATCH] stdlib: Optimize number of calls to comparison function Kuan-Wei Chiu 0 siblings, 2 replies; 19+ messages in thread From: Kuan-Wei Chiu @ 2023-12-02 21:48 UTC (permalink / raw) To: libc-alpha; +Cc: adhemerval.zanella, goldstein.w.n, Kuan-Wei Chiu The current heapsort implementation in the siftdown function follows the standard textbook version, necessitating two comparisons at each level. Transitioning to the Bottom-up heapsort version allows us to decrease the required comparisons to just one per level. On average, this modification significantly reduces the comparison count from 2nlog2(n) - 3n + o(n) to nlog2(n) + 0.37*n + o(n). Refs: BOTTOM-UP-HEAPSORT, a new variant of HEAPSORT beating, on an average, QUICKSORT (if n is not very small) Ingo Wegener Theoretical Computer Science, 118(1); Pages 81-98, 13 September 1993 https://doi.org/10.1016/0304-3975(93)90364-Y Signed-off-by: Kuan-Wei Chiu <visitorckw@gmail.com> --- stdlib/qsort.c | 36 ++++++++++++++++++------------------ 1 file changed, 18 insertions(+), 18 deletions(-) diff --git a/stdlib/qsort.c b/stdlib/qsort.c index be01fb5598..f5babef150 100644 --- a/stdlib/qsort.c +++ b/stdlib/qsort.c @@ -132,26 +132,26 @@ static inline void siftdown (void *base, size_t size, size_t k, size_t n, enum swap_type_t swap_type, __compar_d_fn_t cmp, void *arg) { - /* There can only be a heap condition violation if there are - children. */ - while (2 * k + 1 <= n) - { - /* Left child. */ - size_t j = 2 * k + 1; - /* If the right child is larger, use it. */ - if (j < n && cmp (base + (j * size), base + ((j + 1) * size), arg) < 0) - j++; - - /* If k is already >= to its children, we are done. */ - if (j == k || cmp (base + (k * size), base + (j * size), arg) >= 0) - break; + size_t i, j; + + /* Find the sift-down path all the way to the leaves. */ + for (i = k; j = 2 * i + 1, j + 1 <= n;) + i = cmp (base + j * size, base + (j + 1) * size, arg) >= 0 ? j : (j + 1); - /* Heal the violation. */ - do_swap (base + (size * j), base + (k * size), size, swap_type); + /* Special case for the last leaf with no sibling. */ + if (j == n) + i = j; - /* Swapping with j may have introduced a violation at j. Fix - it in the next loop iteration. */ - k = j; + /* Backtrack to the correct location. */ + while (i != k && cmp (base + k * size, base + i * size, arg) > 0) + i = (i - 1) / 2; + + /* Shift the element into its correct place. */ + j = i; + while (i != k) + { + i = (i - 1) / 2; + do_swap (base + i * size, base + j * size, size, swap_type); } } -- 2.25.1 ^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [PATCH] stdlib: Optimize number of calls to comparison function 2023-12-02 21:48 [PATCH] stdlib: Optimize number of calls to comparison function Kuan-Wei Chiu @ 2023-12-04 8:20 ` Florian Weimer 2023-12-04 18:31 ` Kuan-Wei Chiu 2023-12-05 3:22 ` [PATCH v2 0/2] stdlib: Optimize number of calls to comparison function in qsort Kuan-Wei Chiu 2024-02-16 7:08 ` [PATCH] stdlib: Optimize number of calls to comparison function Kuan-Wei Chiu 1 sibling, 2 replies; 19+ messages in thread From: Florian Weimer @ 2023-12-04 8:20 UTC (permalink / raw) To: Kuan-Wei Chiu; +Cc: libc-alpha, adhemerval.zanella, goldstein.w.n * Kuan-Wei Chiu: > The current heapsort implementation in the siftdown function follows > the standard textbook version, necessitating two comparisons at each > level. Transitioning to the Bottom-up heapsort version allows us to > decrease the required comparisons to just one per level. On average, > this modification significantly reduces the comparison count from > 2nlog2(n) - 3n + o(n) to nlog2(n) + 0.37*n + o(n). I think the factor in stdlib/tst-qsort5.c needs to be adjusted: /* This is an arbitrary factor which is true for the current implementation across a wide range of sizes. */ TEST_VERIFY (factor <= 4.5); On the other hand, the heapsort code is not really expected to run at all in the current implementation because median-of-three quicksort usually avoids degenerating behavior. If this heapsort variant uses a number of comparisons that is competitive to quicksort, maybe we should use it instead? And use insertion sort for short arrays only. Thanks, Florian ^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [PATCH] stdlib: Optimize number of calls to comparison function 2023-12-04 8:20 ` Florian Weimer @ 2023-12-04 18:31 ` Kuan-Wei Chiu 2023-12-05 10:44 ` Florian Weimer 2023-12-05 20:35 ` Zack Weinberg 2023-12-05 3:22 ` [PATCH v2 0/2] stdlib: Optimize number of calls to comparison function in qsort Kuan-Wei Chiu 1 sibling, 2 replies; 19+ messages in thread From: Kuan-Wei Chiu @ 2023-12-04 18:31 UTC (permalink / raw) To: Florian Weimer; +Cc: libc-alpha, adhemerval.zanella, goldstein.w.n On Mon, Dec 04, 2023 at 09:20:50AM +0100, Florian Weimer wrote: > I think the factor in stdlib/tst-qsort5.c needs to be adjusted: > > /* This is an arbitrary factor which is true for the current > implementation across a wide range of sizes. */ > TEST_VERIFY (factor <= 4.5); > It seems that the factor can be adjusted to around 3.5. I can send another patch for this adjustment or resend it as a patch series. > On the other hand, the heapsort code is not really expected to run at > all in the current implementation because median-of-three quicksort > usually avoids degenerating behavior. If this heapsort variant uses a > number of comparisons that is competitive to quicksort, maybe we should > use it instead? And use insertion sort for short arrays only. > This patch currently reduces the comparison count only in scenarios where deep recursion results in a fallback to heapsort. According to the referenced paper, beyond n >= 16000, the comparison count of bottom-up heapsort is lower than that of median-of-three quicksort. However, my concern is that quicksort typically exhibits better locality of reference, making it more cache-friendly, while heapsort lacks this advantage. Best regards, Kuan-Wei Chiu ^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [PATCH] stdlib: Optimize number of calls to comparison function 2023-12-04 18:31 ` Kuan-Wei Chiu @ 2023-12-05 10:44 ` Florian Weimer 2023-12-05 20:00 ` Kuan-Wei Chiu 2023-12-05 20:35 ` Zack Weinberg 1 sibling, 1 reply; 19+ messages in thread From: Florian Weimer @ 2023-12-05 10:44 UTC (permalink / raw) To: Kuan-Wei Chiu; +Cc: libc-alpha, adhemerval.zanella, goldstein.w.n * Kuan-Wei Chiu: > On Mon, Dec 04, 2023 at 09:20:50AM +0100, Florian Weimer wrote: >> I think the factor in stdlib/tst-qsort5.c needs to be adjusted: >> >> /* This is an arbitrary factor which is true for the current >> implementation across a wide range of sizes. */ >> TEST_VERIFY (factor <= 4.5); >> > > It seems that the factor can be adjusted to around 3.5. I can send > another patch for this adjustment or resend it as a patch series. Based on the paper, I would have expected a greater reduction. >> On the other hand, the heapsort code is not really expected to run at >> all in the current implementation because median-of-three quicksort >> usually avoids degenerating behavior. If this heapsort variant uses a >> number of comparisons that is competitive to quicksort, maybe we should >> use it instead? And use insertion sort for short arrays only. >> > > This patch currently reduces the comparison count only in scenarios > where deep recursion results in a fallback to heapsort. According to > the referenced paper, beyond n >= 16000, the comparison count of > bottom-up heapsort is lower than that of median-of-three quicksort. > However, my concern is that quicksort typically exhibits better > locality of reference, making it more cache-friendly, while heapsort > lacks this advantage. Yes, the performance numbers I posted suggest that something like that is at play here. Even though heapsort with your patch performs fewer comparisons to most other implementations we have floating around (except the historic merge sort), it is still slower than most. Thanks, Florian ^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [PATCH] stdlib: Optimize number of calls to comparison function 2023-12-05 10:44 ` Florian Weimer @ 2023-12-05 20:00 ` Kuan-Wei Chiu 0 siblings, 0 replies; 19+ messages in thread From: Kuan-Wei Chiu @ 2023-12-05 20:00 UTC (permalink / raw) To: Florian Weimer; +Cc: libc-alpha, adhemerval.zanella, goldstein.w.n On Tue, Dec 05, 2023 at 11:44:03AM +0100, Florian Weimer wrote: > * Kuan-Wei Chiu: > > > On Mon, Dec 04, 2023 at 09:20:50AM +0100, Florian Weimer wrote: > >> I think the factor in stdlib/tst-qsort5.c needs to be adjusted: > >> > >> /* This is an arbitrary factor which is true for the current > >> implementation across a wide range of sizes. */ > >> TEST_VERIFY (factor <= 4.5); > >> > > > > It seems that the factor can be adjusted to around 3.5. I can send > > another patch for this adjustment or resend it as a patch series. > > Based on the paper, I would have expected a greater reduction. > The adjustment factor of 3.5 is derived from data obtained using the tst-qsort5 test. Upon examining the number of elements sorted by heapsort during the test and the corresponding comparison counts, the comparison count is approximately nlog2(n) + 0.83*n. This is higher than the analysis in the paper, which suggests nlog2(n) + d(n)*n, where d(n) belongs to the range [0.34, 0.39]. I believe this discrepancy is due to the paper's focus on the average comparison count in random scenarios, whereas here, we are intentionally creating adversarial permutations, making it not purely random. Additionally, in the performance numbers you posted, d(n) is approximately equal to 0.34, aligning with the analysis in the paper. Here is the test data: heapsort: number of comparisons, number of elements Before applying the patch (commit 8e755f5bc8): info: length 100: 1832 comparisons ~ 2.438387 * n * log (n) heapsort: 662, 76 info: length 1000: 34020 comparisons ~ 3.260275 * n * log (n) heapsort: 15324, 964 info: length 10000: 494661 comparisons ~ 3.772689 * n * log (n) heapsort: 225309, 9948 info: length 20000: 1071780 comparisons ~ 3.859536 * n * log (n) heapsort: 492534, 19944 info: length 30000: 1644872 comparisons ~ 3.830672 * n * log (n) heapsort: 775626, 29944 info: length 40000: 2305025 comparisons ~ 3.933328 * n * log (n) heapsort: 1065893, 39940 info: length 50000: 2912911 comparisons ~ 3.913976 * n * log (n) heapsort: 1363779, 49940 info: length 60000: 3529897 comparisons ~ 3.902134 * n * log (n) heapsort: 1670765, 59940 info: length 70000: 4285743 comparisons ~ 4.009278 * n * log (n) heapsort: 1976733, 69936 info: length 80000: 4929859 comparisons ~ 3.998699 * n * log (n) heapsort: 2290849, 79936 info: length 90000: 5575320 comparisons ~ 3.987933 * n * log (n) heapsort: 2606310, 89936 info: length 100000: 6223572 comparisons ~ 3.978286 * n * log (n) heapsort: 2924562, 99936 info: length 110000: 6881296 comparisons ~ 3.973322 * n * log (n) heapsort: 3252286, 109936 info: length 120000: 7537050 comparisons ~ 3.966365 * n * log (n) heapsort: 3578040, 119936 info: length 130000: 8191258 comparisons ~ 3.958248 * n * log (n) heapsort: 3902248, 129936 info: length 140000: 9138135 comparisons ~ 4.072406 * n * log (n) heapsort: 4239255, 139932 info: length 150000: 9822309 comparisons ~ 4.067167 * n * log (n) heapsort: 4573429, 149932 After applying the patch: info: length 100: 1690 comparisons ~ 2.273802 * n * log (n) heapsort: 520, 76 info: length 1000: 28993 comparisons ~ 2.821755 * n * log (n) heapsort: 10297, 964 info: length 10000: 409589 comparisons ~ 3.169480 * n * log (n) heapsort: 140237, 9948 info: length 20000: 879212 comparisons ~ 3.211902 * n * log (n) heapsort: 299966, 19944 info: length 30000: 1336536 comparisons ~ 3.158407 * n * log (n) heapsort: 467290, 29944 info: length 40000: 1882150 comparisons ~ 3.256782 * n * log (n) heapsort: 643018, 39940 info: length 50000: 2369603 comparisons ~ 3.228731 * n * log (n) heapsort: 820471, 49940 info: length 60000: 2859511 comparisons ~ 3.205839 * n * log (n) heapsort: 1000379, 59940 info: length 70000: 3492146 comparisons ~ 3.311278 * n * log (n) heapsort: 1183136, 69936 info: length 80000: 4006690 comparisons ~ 3.294222 * n * log (n) heapsort: 1367680, 79936 info: length 90000: 4523504 comparisons ~ 3.279727 * n * log (n) heapsort: 1554494, 89936 info: length 100000: 5041697 comparisons ~ 3.266775 * n * log (n) heapsort: 1742687, 99936 info: length 110000: 5562597 comparisons ~ 3.255888 * n * log (n) heapsort: 1933587, 109936 info: length 120000: 6083545 comparisons ~ 3.245368 * n * log (n) heapsort: 2124535, 119936 info: length 130000: 6604920 comparisons ~ 3.235434 * n * log (n) heapsort: 2315910, 129936 info: length 140000: 7407274 comparisons ~ 3.344872 * n * log (n) heapsort: 2508394, 139932 info: length 150000: 7951873 comparisons ~ 3.336444 * n * log (n) heapsort: 2702993, 149932 Best regards, Kuan-Wei Chiu > ^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [PATCH] stdlib: Optimize number of calls to comparison function 2023-12-04 18:31 ` Kuan-Wei Chiu 2023-12-05 10:44 ` Florian Weimer @ 2023-12-05 20:35 ` Zack Weinberg 2023-12-06 10:21 ` Florian Weimer 1 sibling, 1 reply; 19+ messages in thread From: Zack Weinberg @ 2023-12-05 20:35 UTC (permalink / raw) To: Kuan-Wei Chiu, Florian Weimer Cc: GNU libc development, Adhemerval Zanella, goldstein.w.n On Mon, Dec 4, 2023, at 1:31 PM, Kuan-Wei Chiu wrote: > On Mon, Dec 04, 2023 at 09:20:50AM +0100, Florian Weimer wrote: >> I think the factor in stdlib/tst-qsort5.c needs to be adjusted: >> >> /* This is an arbitrary factor which is true for the current >> implementation across a wide range of sizes. */ >> TEST_VERIFY (factor <= 4.5); > > It seems that the factor can be adjusted to around 3.5. I can send > another patch for this adjustment or resend it as a patch series. Before you go any further with this patch series I think we need to decide whether our backward compatibility constraints mean our qsort has to be a stable sort. If so, we should make it *always* be a stable sort and write that into the documentation, and that would mean junking the entire heapsort implementation. I do not have a strong opinion either way, but I do think it should either *always* or *never* be stable, i.e. not the previous state of "whether you get a stable sort depends on the size of the input and various other undocumented factors." Also, I think that the behavior should not depend on which version of glibc a program was compiled against. zw ^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [PATCH] stdlib: Optimize number of calls to comparison function 2023-12-05 20:35 ` Zack Weinberg @ 2023-12-06 10:21 ` Florian Weimer 2023-12-06 12:51 ` Adhemerval Zanella Netto 0 siblings, 1 reply; 19+ messages in thread From: Florian Weimer @ 2023-12-06 10:21 UTC (permalink / raw) To: Zack Weinberg Cc: Kuan-Wei Chiu, GNU libc development, Adhemerval Zanella, goldstein.w.n * Zack Weinberg: > On Mon, Dec 4, 2023, at 1:31 PM, Kuan-Wei Chiu wrote: >> On Mon, Dec 04, 2023 at 09:20:50AM +0100, Florian Weimer wrote: >>> I think the factor in stdlib/tst-qsort5.c needs to be adjusted: >>> >>> /* This is an arbitrary factor which is true for the current >>> implementation across a wide range of sizes. */ >>> TEST_VERIFY (factor <= 4.5); >> >> It seems that the factor can be adjusted to around 3.5. I can send >> another patch for this adjustment or resend it as a patch series. > > Before you go any further with this patch series I think we need to > decide whether our backward compatibility constraints mean our qsort > has to be a stable sort. If so, we should make it *always* be a > stable sort and write that into the documentation, and that would > mean junking the entire heapsort implementation. That makes sense, although there might not exist an in-place sorting algorithm that takes constant extra space regardless of element count and element size and has reasonable performance. Maybe we could say, “stable if element size is less than 1024 bytes” or something like that. What I'm concerned about is that with the current implementation, we take a performance hit *and* have compatibility problems. The compatibility problems would be easier to justify if we actually made things faster. Not calling malloc internally is unlikely to be compelling to programmers. Thanks, Florian ^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [PATCH] stdlib: Optimize number of calls to comparison function 2023-12-06 10:21 ` Florian Weimer @ 2023-12-06 12:51 ` Adhemerval Zanella Netto 2023-12-17 15:42 ` Zack Weinberg 0 siblings, 1 reply; 19+ messages in thread From: Adhemerval Zanella Netto @ 2023-12-06 12:51 UTC (permalink / raw) To: Florian Weimer, Zack Weinberg Cc: Kuan-Wei Chiu, GNU libc development, goldstein.w.n On 06/12/23 07:21, Florian Weimer wrote: > * Zack Weinberg: > >> On Mon, Dec 4, 2023, at 1:31 PM, Kuan-Wei Chiu wrote: >>> On Mon, Dec 04, 2023 at 09:20:50AM +0100, Florian Weimer wrote: >>>> I think the factor in stdlib/tst-qsort5.c needs to be adjusted: >>>> >>>> /* This is an arbitrary factor which is true for the current >>>> implementation across a wide range of sizes. */ >>>> TEST_VERIFY (factor <= 4.5); >>> >>> It seems that the factor can be adjusted to around 3.5. I can send >>> another patch for this adjustment or resend it as a patch series. >> >> Before you go any further with this patch series I think we need to >> decide whether our backward compatibility constraints mean our qsort >> has to be a stable sort. If so, we should make it *always* be a >> stable sort and write that into the documentation, and that would >> mean junking the entire heapsort implementation. > > That makes sense, although there might not exist an in-place sorting > algorithm that takes constant extra space regardless of element count > and element size and has reasonable performance. Maybe we could say, > “stable if element size is less than 1024 bytes” or something like that. > > What I'm concerned about is that with the current implementation, we > take a performance hit *and* have compatibility problems. The > compatibility problems would be easier to justify if we actually made > things faster. Not calling malloc internally is unlikely to be > compelling to programmers. My initial intention was to remove internal multiple sorting strategies that has different semantics and that might eventually trigger corner cases issues (the stable default mergesort versus the fallback quicksort), and fix the quicksort fallback that had the long standing issue of O(n^2) on adversarial inputs. However it seems that glibc qsort now become an instance on Hyrum's law, where it does not really matter that neither POSIX nor C standard promised sorting stability. So I presume it would be better to keep with mergesort for all input sizes, along with the limited stack buffer (so there is no failure point up a certain input number/size); and fallback to heapsort on memory allocation failure. I will prepare a patch to restore it. Other system does provide mergesort as a different symbol [1], which also returns a failure is case memory can not be allocated. But I don't think it would be an improvement in this situation (no one will really adapt their code to use mergesort if a stable allocation is required). I also though about adding a tunable to enable mergesort on qsort, but it would ending up only add more complexity and required more testing. [1] https://man.freebsd.org/cgi/man.cgi?query=mergesort&sektion=3 ^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [PATCH] stdlib: Optimize number of calls to comparison function 2023-12-06 12:51 ` Adhemerval Zanella Netto @ 2023-12-17 15:42 ` Zack Weinberg 2023-12-17 15:55 ` Florian Weimer 2023-12-17 16:47 ` Kuan-Wei Chiu 0 siblings, 2 replies; 19+ messages in thread From: Zack Weinberg @ 2023-12-17 15:42 UTC (permalink / raw) To: Adhemerval Zanella, Florian Weimer Cc: Kuan-Wei Chiu, GNU libc development, Noah Goldstein On Wed, Dec 6, 2023, at 7:51 AM, Adhemerval Zanella Netto wrote: > On 06/12/23 07:21, Florian Weimer wrote: >> * Zack Weinberg: >>> >>> Before you go any further with this patch series I think we need to >>> decide whether our backward compatibility constraints mean our qsort >>> has to be a stable sort. If so, we should make it *always* be a >>> stable sort and write that into the documentation, and that would >>> mean junking the entire heapsort implementation. >> >> That makes sense, although there might not exist an in-place sorting >> algorithm that takes constant extra space regardless of element >> count and element size and has reasonable performance. Maybe we >> could say, “stable if element size is less than 1024 bytes” or >> something like that. It occurred to me late last night that *any* in-place sorting algorithm that operates on an array, can be made stable with no additional storage cost, by using each element's address as a tiebreaker for the comparison function. It *doesn't* need to be the element's original address -- all that matters is that for every pair of elements that compare equal, the sorting algorithm preserves the < relation for their addresses across all swaps, which should happen naturally if address-< is used as a tiebreaker for comparisons. (That said, I still think we should give that "block exchange mergesort" algorithm I dug up the other week a try. I might have time early in January to put a patch set together.) zw ^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [PATCH] stdlib: Optimize number of calls to comparison function 2023-12-17 15:42 ` Zack Weinberg @ 2023-12-17 15:55 ` Florian Weimer 2023-12-17 16:47 ` Kuan-Wei Chiu 1 sibling, 0 replies; 19+ messages in thread From: Florian Weimer @ 2023-12-17 15:55 UTC (permalink / raw) To: Zack Weinberg Cc: Adhemerval Zanella, Kuan-Wei Chiu, GNU libc development, Noah Goldstein * Zack Weinberg: > It occurred to me late last night that *any* in-place sorting algorithm > that operates on an array, can be made stable with no additional storage > cost, by using each element's address as a tiebreaker for the comparison > function. It *doesn't* need to be the element's original address -- all > that matters is that for every pair of elements that compare equal, the > sorting algorithm preserves the < relation for their addresses across > all swaps, which should happen naturally if address-< is used as a > tiebreaker for comparisons. It's not as simple as that because we might have A < B, so we can move A to a lower address than B, but at the same time time, we must make sure that we do not move it past an element C with A == C, otherwise the result of the tie-breaking address comparison changes. I haven't really thought about it, but I suspect it's possible to do for some of the quadratic algorithms, but quicksort and heapsort—not so much. Thanks, Florian ^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [PATCH] stdlib: Optimize number of calls to comparison function 2023-12-17 15:42 ` Zack Weinberg 2023-12-17 15:55 ` Florian Weimer @ 2023-12-17 16:47 ` Kuan-Wei Chiu 2023-12-17 18:02 ` Florian Weimer 1 sibling, 1 reply; 19+ messages in thread From: Kuan-Wei Chiu @ 2023-12-17 16:47 UTC (permalink / raw) To: Zack Weinberg, Florian Weimer, Adhemerval Zanella Cc: GNU libc development, Noah Goldstein On Sun, Dec 17, 2023 at 10:42:17AM -0500, Zack Weinberg wrote: > It occurred to me late last night that *any* in-place sorting algorithm > that operates on an array, can be made stable with no additional storage > cost, by using each element's address as a tiebreaker for the comparison > function. It *doesn't* need to be the element's original address -- all > that matters is that for every pair of elements that compare equal, the > sorting algorithm preserves the < relation for their addresses across > all swaps, which should happen naturally if address-< is used as a > tiebreaker for comparisons. > > (That said, I still think we should give that "block exchange mergesort" > algorithm I dug up the other week a try. I might have time early in > January to put a patch set together.) Additionally, if we're open to exploring non-in-place options, I think Timsort might be worth considering. It's widely adopted, including in Python and Android, serving as a stable hybrid sorting algorithm. Timsort is an adaptive sort algorithm that operates in O(n) time when the input is already sorted and maintains a worst-case time complexity of O(n log n), similar to mergesort. I'd be happy to prepare a patch for Timsort next month if needed. Link: https://github.com/python/cpython/blob/main/Objects/listsort.txt Best regards, Kuan-Wei Chiu ^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [PATCH] stdlib: Optimize number of calls to comparison function 2023-12-17 16:47 ` Kuan-Wei Chiu @ 2023-12-17 18:02 ` Florian Weimer 0 siblings, 0 replies; 19+ messages in thread From: Florian Weimer @ 2023-12-17 18:02 UTC (permalink / raw) To: Kuan-Wei Chiu Cc: Zack Weinberg, Adhemerval Zanella, GNU libc development, Noah Goldstein * Kuan-Wei Chiu: > On Sun, Dec 17, 2023 at 10:42:17AM -0500, Zack Weinberg wrote: >> It occurred to me late last night that *any* in-place sorting algorithm >> that operates on an array, can be made stable with no additional storage >> cost, by using each element's address as a tiebreaker for the comparison >> function. It *doesn't* need to be the element's original address -- all >> that matters is that for every pair of elements that compare equal, the >> sorting algorithm preserves the < relation for their addresses across >> all swaps, which should happen naturally if address-< is used as a >> tiebreaker for comparisons. >> >> (That said, I still think we should give that "block exchange mergesort" >> algorithm I dug up the other week a try. I might have time early in >> January to put a patch set together.) > > Additionally, if we're open to exploring non-in-place options, I think > Timsort might be worth considering. It's widely adopted, including in > Python and Android, serving as a stable hybrid sorting algorithm. > Timsort is an adaptive sort algorithm that operates in O(n) time when > the input is already sorted and maintains a worst-case time complexity > of O(n log n), similar to mergesort. I'd be happy to prepare a patch > for Timsort next month if needed. I think we really want to get rid of that malloc in qsort. This means that we need something in-place-ish: we can throw a fair amount of stack space at the problem (probably quite a bit more of the circa 1024 bytes we use today), but not something that scales linearly with the size of the input array. We cannot not even even assume space for just one temporary object on the stack, I think. Thanks, Florian ^ permalink raw reply [flat|nested] 19+ messages in thread
* [PATCH v2 0/2] stdlib: Optimize number of calls to comparison function in qsort 2023-12-04 8:20 ` Florian Weimer 2023-12-04 18:31 ` Kuan-Wei Chiu @ 2023-12-05 3:22 ` Kuan-Wei Chiu 2023-12-05 3:22 ` [PATCH v2 1/2] " Kuan-Wei Chiu 2023-12-05 3:22 ` [PATCH v2 2/2] stdlib: Adjust the factor in tst-qsort5 Kuan-Wei Chiu 1 sibling, 2 replies; 19+ messages in thread From: Kuan-Wei Chiu @ 2023-12-05 3:22 UTC (permalink / raw) To: fweimer; +Cc: libc-alpha, adhemerval.zanella, goldstein.w.n, Kuan-Wei Chiu The current heapsort implementation in the siftdown function follows the standard textbook version, necessitating two comparisons at each level. Transitioning to the Bottom-up heapsort version allows us to decrease the required comparisons to just one per level. On average, this modification significantly reduces the comparison count from 2nlog2(n) - 3n + O(n) to nlog2(n) + 0.37*n + O(n). Refs: BOTTOM-UP-HEAPSORT, a new variant of HEAPSORT beating, on an average, QUICKSORT (if n is not very small) Ingo Wegener Theoretical Computer Science, 118(1); Pages 81-98, 13 September 1993 https://doi.org/10.1016/0304-3975(93)90364-Y Changes from v1: * Use a more concise title. * Correct the error in asymptotic notation from little-o to big-O. * Adjusted the factor in tst-qsort5 from 4.5 to 3.5. Kuan-Wei Chiu (2): stdlib: Optimize number of calls to comparison function in qsort stdlib: Adjust the factor in tst-qsort5 stdlib/qsort.c | 36 ++++++++++++++++++------------------ stdlib/tst-qsort5.c | 2 +- 2 files changed, 19 insertions(+), 19 deletions(-) -- 2.25.1 ^ permalink raw reply [flat|nested] 19+ messages in thread
* [PATCH v2 1/2] stdlib: Optimize number of calls to comparison function in qsort 2023-12-05 3:22 ` [PATCH v2 0/2] stdlib: Optimize number of calls to comparison function in qsort Kuan-Wei Chiu @ 2023-12-05 3:22 ` Kuan-Wei Chiu 2023-12-05 3:22 ` [PATCH v2 2/2] stdlib: Adjust the factor in tst-qsort5 Kuan-Wei Chiu 1 sibling, 0 replies; 19+ messages in thread From: Kuan-Wei Chiu @ 2023-12-05 3:22 UTC (permalink / raw) To: fweimer; +Cc: libc-alpha, adhemerval.zanella, goldstein.w.n, Kuan-Wei Chiu The current heapsort implementation in the siftdown function follows the standard textbook version, necessitating two comparisons at each level. Transitioning to the Bottom-up heapsort version allows us to decrease the required comparisons to just one per level. On average, this modification significantly reduces the comparison count from 2nlog2(n) - 3n + O(n) to nlog2(n) + 0.37*n + O(n). Refs: BOTTOM-UP-HEAPSORT, a new variant of HEAPSORT beating, on an average, QUICKSORT (if n is not very small) Ingo Wegener Theoretical Computer Science, 118(1); Pages 81-98, 13 September 1993 https://doi.org/10.1016/0304-3975(93)90364-Y Signed-off-by: Kuan-Wei Chiu <visitorckw@gmail.com> --- Changes from v1: * Use a more concise title. * Correct the error in asymptotic notation from little-o to big-O. stdlib/qsort.c | 36 ++++++++++++++++++------------------ 1 file changed, 18 insertions(+), 18 deletions(-) diff --git a/stdlib/qsort.c b/stdlib/qsort.c index 62477010b6..0d924f5e2d 100644 --- a/stdlib/qsort.c +++ b/stdlib/qsort.c @@ -132,26 +132,26 @@ static inline void siftdown (void *base, size_t size, size_t k, size_t n, enum swap_type_t swap_type, __compar_d_fn_t cmp, void *arg) { - /* There can only be a heap condition violation if there are - children. */ - while (2 * k + 1 <= n) - { - /* Left child. */ - size_t j = 2 * k + 1; - /* If the right child is larger, use it. */ - if (j < n && cmp (base + (j * size), base + ((j + 1) * size), arg) < 0) - j++; - - /* If k is already >= to its children, we are done. */ - if (j == k || cmp (base + (k * size), base + (j * size), arg) >= 0) - break; + size_t i, j; + + /* Find the sift-down path all the way to the leaves. */ + for (i = k; j = 2 * i + 1, j + 1 <= n;) + i = cmp (base + j * size, base + (j + 1) * size, arg) >= 0 ? j : (j + 1); - /* Heal the violation. */ - do_swap (base + (size * j), base + (k * size), size, swap_type); + /* Special case for the last leaf with no sibling. */ + if (j == n) + i = j; - /* Swapping with j may have introduced a violation at j. Fix - it in the next loop iteration. */ - k = j; + /* Backtrack to the correct location. */ + while (i != k && cmp (base + k * size, base + i * size, arg) > 0) + i = (i - 1) / 2; + + /* Shift the element into its correct place. */ + j = i; + while (i != k) + { + i = (i - 1) / 2; + do_swap (base + i * size, base + j * size, size, swap_type); } } -- 2.25.1 ^ permalink raw reply [flat|nested] 19+ messages in thread
* [PATCH v2 2/2] stdlib: Adjust the factor in tst-qsort5 2023-12-05 3:22 ` [PATCH v2 0/2] stdlib: Optimize number of calls to comparison function in qsort Kuan-Wei Chiu 2023-12-05 3:22 ` [PATCH v2 1/2] " Kuan-Wei Chiu @ 2023-12-05 3:22 ` Kuan-Wei Chiu 1 sibling, 0 replies; 19+ messages in thread From: Kuan-Wei Chiu @ 2023-12-05 3:22 UTC (permalink / raw) To: fweimer; +Cc: libc-alpha, adhemerval.zanella, goldstein.w.n, Kuan-Wei Chiu Adjust the factor in the tst-qsort5 from 4.5 to 3.5 to better align with the current implementation characteristics. This refinement is prompted by the adoption of the bottom-up heapsort optimization, as it significantly reduces the required comparisons. Signed-off-by: Kuan-Wei Chiu <visitorckw@gmail.com> --- Changes from v1: * Adjusted the factor in tst-qsort5 from 4.5 to 3.5. stdlib/tst-qsort5.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/stdlib/tst-qsort5.c b/stdlib/tst-qsort5.c index d3a88c30f8..63590967a6 100644 --- a/stdlib/tst-qsort5.c +++ b/stdlib/tst-qsort5.c @@ -155,7 +155,7 @@ check_one_n (size_t n) n, count, factor); /* This is an arbitrary factor which is true for the current implementation across a wide range of sizes. */ - TEST_VERIFY (factor <= 4.5); + TEST_VERIFY (factor <= 3.5); } static int -- 2.25.1 ^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [PATCH] stdlib: Optimize number of calls to comparison function 2023-12-02 21:48 [PATCH] stdlib: Optimize number of calls to comparison function Kuan-Wei Chiu 2023-12-04 8:20 ` Florian Weimer @ 2024-02-16 7:08 ` Kuan-Wei Chiu 2024-03-27 19:45 ` Adhemerval Zanella Netto 1 sibling, 1 reply; 19+ messages in thread From: Kuan-Wei Chiu @ 2024-02-16 7:08 UTC (permalink / raw) To: libc-alpha; +Cc: adhemerval.zanella, goldstein.w.n, fweimer, zack On Sun, Dec 03, 2023 at 05:48:39AM +0800, Kuan-Wei Chiu wrote: > The current heapsort implementation in the siftdown function follows > the standard textbook version, necessitating two comparisons at each > level. Transitioning to the Bottom-up heapsort version allows us to > decrease the required comparisons to just one per level. On average, > this modification significantly reduces the comparison count from > 2nlog2(n) - 3n + o(n) to nlog2(n) + 0.37*n + o(n). > > Refs: > BOTTOM-UP-HEAPSORT, a new variant of HEAPSORT beating, on an average, > QUICKSORT (if n is not very small) > Ingo Wegener > Theoretical Computer Science, 118(1); Pages 81-98, 13 September 1993 > https://doi.org/10.1016/0304-3975(93)90364-Y > > Signed-off-by: Kuan-Wei Chiu <visitorckw@gmail.com> > --- > stdlib/qsort.c | 36 ++++++++++++++++++------------------ > 1 file changed, 18 insertions(+), 18 deletions(-) > > diff --git a/stdlib/qsort.c b/stdlib/qsort.c > index be01fb5598..f5babef150 100644 > --- a/stdlib/qsort.c > +++ b/stdlib/qsort.c > @@ -132,26 +132,26 @@ static inline void > siftdown (void *base, size_t size, size_t k, size_t n, > enum swap_type_t swap_type, __compar_d_fn_t cmp, void *arg) > { > - /* There can only be a heap condition violation if there are > - children. */ > - while (2 * k + 1 <= n) > - { > - /* Left child. */ > - size_t j = 2 * k + 1; > - /* If the right child is larger, use it. */ > - if (j < n && cmp (base + (j * size), base + ((j + 1) * size), arg) < 0) > - j++; > - > - /* If k is already >= to its children, we are done. */ > - if (j == k || cmp (base + (k * size), base + (j * size), arg) >= 0) > - break; > + size_t i, j; > + > + /* Find the sift-down path all the way to the leaves. */ > + for (i = k; j = 2 * i + 1, j + 1 <= n;) > + i = cmp (base + j * size, base + (j + 1) * size, arg) >= 0 ? j : (j + 1); > > - /* Heal the violation. */ > - do_swap (base + (size * j), base + (k * size), size, swap_type); > + /* Special case for the last leaf with no sibling. */ > + if (j == n) > + i = j; > > - /* Swapping with j may have introduced a violation at j. Fix > - it in the next loop iteration. */ > - k = j; > + /* Backtrack to the correct location. */ > + while (i != k && cmp (base + k * size, base + i * size, arg) > 0) > + i = (i - 1) / 2; > + > + /* Shift the element into its correct place. */ > + j = i; > + while (i != k) > + { > + i = (i - 1) / 2; > + do_swap (base + i * size, base + j * size, size, swap_type); > } > } > > -- > 2.25.1 > Since we still retain heapsort as a fallback for mergesort, should we reconsider applying this patch? Thanks, Kuan-Wei ^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [PATCH] stdlib: Optimize number of calls to comparison function 2024-02-16 7:08 ` [PATCH] stdlib: Optimize number of calls to comparison function Kuan-Wei Chiu @ 2024-03-27 19:45 ` Adhemerval Zanella Netto 2024-03-27 19:59 ` Kuan-Wei Chiu 0 siblings, 1 reply; 19+ messages in thread From: Adhemerval Zanella Netto @ 2024-03-27 19:45 UTC (permalink / raw) To: Kuan-Wei Chiu, libc-alpha; +Cc: goldstein.w.n, fweimer, zack On 16/02/24 04:08, Kuan-Wei Chiu wrote: > On Sun, Dec 03, 2023 at 05:48:39AM +0800, Kuan-Wei Chiu wrote: >> The current heapsort implementation in the siftdown function follows >> the standard textbook version, necessitating two comparisons at each >> level. Transitioning to the Bottom-up heapsort version allows us to >> decrease the required comparisons to just one per level. On average, >> this modification significantly reduces the comparison count from >> 2nlog2(n) - 3n + o(n) to nlog2(n) + 0.37*n + o(n). >> >> Refs: >> BOTTOM-UP-HEAPSORT, a new variant of HEAPSORT beating, on an average, >> QUICKSORT (if n is not very small) >> Ingo Wegener >> Theoretical Computer Science, 118(1); Pages 81-98, 13 September 1993 >> https://doi.org/10.1016/0304-3975(93)90364-Y >> >> Signed-off-by: Kuan-Wei Chiu <visitorckw@gmail.com> >> --- >> stdlib/qsort.c | 36 ++++++++++++++++++------------------ >> 1 file changed, 18 insertions(+), 18 deletions(-) >> >> diff --git a/stdlib/qsort.c b/stdlib/qsort.c >> index be01fb5598..f5babef150 100644 >> --- a/stdlib/qsort.c >> +++ b/stdlib/qsort.c >> @@ -132,26 +132,26 @@ static inline void >> siftdown (void *base, size_t size, size_t k, size_t n, >> enum swap_type_t swap_type, __compar_d_fn_t cmp, void *arg) >> { >> - /* There can only be a heap condition violation if there are >> - children. */ >> - while (2 * k + 1 <= n) >> - { >> - /* Left child. */ >> - size_t j = 2 * k + 1; >> - /* If the right child is larger, use it. */ >> - if (j < n && cmp (base + (j * size), base + ((j + 1) * size), arg) < 0) >> - j++; >> - >> - /* If k is already >= to its children, we are done. */ >> - if (j == k || cmp (base + (k * size), base + (j * size), arg) >= 0) >> - break; >> + size_t i, j; >> + >> + /* Find the sift-down path all the way to the leaves. */ >> + for (i = k; j = 2 * i + 1, j + 1 <= n;) >> + i = cmp (base + j * size, base + (j + 1) * size, arg) >= 0 ? j : (j + 1); >> >> - /* Heal the violation. */ >> - do_swap (base + (size * j), base + (k * size), size, swap_type); >> + /* Special case for the last leaf with no sibling. */ >> + if (j == n) >> + i = j; >> >> - /* Swapping with j may have introduced a violation at j. Fix >> - it in the next loop iteration. */ >> - k = j; >> + /* Backtrack to the correct location. */ >> + while (i != k && cmp (base + k * size, base + i * size, arg) > 0) >> + i = (i - 1) / 2; >> + >> + /* Shift the element into its correct place. */ >> + j = i; >> + while (i != k) >> + { >> + i = (i - 1) / 2; >> + do_swap (base + i * size, base + j * size, size, swap_type); >> } >> } >> >> -- >> 2.25.1 >> > > Since we still retain heapsort as a fallback for mergesort, should we > reconsider applying this patch? > > Thanks, > Kuan-Wei Yes, I think it is work and it looks great to me. ^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [PATCH] stdlib: Optimize number of calls to comparison function 2024-03-27 19:45 ` Adhemerval Zanella Netto @ 2024-03-27 19:59 ` Kuan-Wei Chiu 2024-03-27 20:42 ` Adhemerval Zanella Netto 0 siblings, 1 reply; 19+ messages in thread From: Kuan-Wei Chiu @ 2024-03-27 19:59 UTC (permalink / raw) To: Adhemerval Zanella Netto; +Cc: libc-alpha, goldstein.w.n, fweimer, zack, jserv On Wed, Mar 27, 2024 at 04:45:35PM -0300, Adhemerval Zanella Netto wrote: > > > On 16/02/24 04:08, Kuan-Wei Chiu wrote: > > On Sun, Dec 03, 2023 at 05:48:39AM +0800, Kuan-Wei Chiu wrote: > >> The current heapsort implementation in the siftdown function follows > >> the standard textbook version, necessitating two comparisons at each > >> level. Transitioning to the Bottom-up heapsort version allows us to > >> decrease the required comparisons to just one per level. On average, > >> this modification significantly reduces the comparison count from > >> 2nlog2(n) - 3n + o(n) to nlog2(n) + 0.37*n + o(n). > >> > >> Refs: > >> BOTTOM-UP-HEAPSORT, a new variant of HEAPSORT beating, on an average, > >> QUICKSORT (if n is not very small) > >> Ingo Wegener > >> Theoretical Computer Science, 118(1); Pages 81-98, 13 September 1993 > >> https://doi.org/10.1016/0304-3975(93)90364-Y > >> > >> Signed-off-by: Kuan-Wei Chiu <visitorckw@gmail.com> > >> --- > >> stdlib/qsort.c | 36 ++++++++++++++++++------------------ > >> 1 file changed, 18 insertions(+), 18 deletions(-) > >> > >> diff --git a/stdlib/qsort.c b/stdlib/qsort.c > >> index be01fb5598..f5babef150 100644 > >> --- a/stdlib/qsort.c > >> +++ b/stdlib/qsort.c > >> @@ -132,26 +132,26 @@ static inline void > >> siftdown (void *base, size_t size, size_t k, size_t n, > >> enum swap_type_t swap_type, __compar_d_fn_t cmp, void *arg) > >> { > >> - /* There can only be a heap condition violation if there are > >> - children. */ > >> - while (2 * k + 1 <= n) > >> - { > >> - /* Left child. */ > >> - size_t j = 2 * k + 1; > >> - /* If the right child is larger, use it. */ > >> - if (j < n && cmp (base + (j * size), base + ((j + 1) * size), arg) < 0) > >> - j++; > >> - > >> - /* If k is already >= to its children, we are done. */ > >> - if (j == k || cmp (base + (k * size), base + (j * size), arg) >= 0) > >> - break; > >> + size_t i, j; > >> + > >> + /* Find the sift-down path all the way to the leaves. */ > >> + for (i = k; j = 2 * i + 1, j + 1 <= n;) > >> + i = cmp (base + j * size, base + (j + 1) * size, arg) >= 0 ? j : (j + 1); > >> > >> - /* Heal the violation. */ > >> - do_swap (base + (size * j), base + (k * size), size, swap_type); > >> + /* Special case for the last leaf with no sibling. */ > >> + if (j == n) > >> + i = j; > >> > >> - /* Swapping with j may have introduced a violation at j. Fix > >> - it in the next loop iteration. */ > >> - k = j; > >> + /* Backtrack to the correct location. */ > >> + while (i != k && cmp (base + k * size, base + i * size, arg) > 0) > >> + i = (i - 1) / 2; > >> + > >> + /* Shift the element into its correct place. */ > >> + j = i; > >> + while (i != k) > >> + { > >> + i = (i - 1) / 2; > >> + do_swap (base + i * size, base + j * size, size, swap_type); > >> } > >> } > >> > >> -- > >> 2.25.1 > >> > > > > Since we still retain heapsort as a fallback for mergesort, should we > > reconsider applying this patch? > > > > Thanks, > > Kuan-Wei > > Yes, I think it is work and it looks great to me. Should I resend a new patch, or can you directly apply this one? Thanks, Kuan-Wei ^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [PATCH] stdlib: Optimize number of calls to comparison function 2024-03-27 19:59 ` Kuan-Wei Chiu @ 2024-03-27 20:42 ` Adhemerval Zanella Netto 0 siblings, 0 replies; 19+ messages in thread From: Adhemerval Zanella Netto @ 2024-03-27 20:42 UTC (permalink / raw) To: Kuan-Wei Chiu; +Cc: libc-alpha, goldstein.w.n, fweimer, zack, jserv On 27/03/24 16:59, Kuan-Wei Chiu wrote: > On Wed, Mar 27, 2024 at 04:45:35PM -0300, Adhemerval Zanella Netto wrote: >> >> >> On 16/02/24 04:08, Kuan-Wei Chiu wrote: >>> On Sun, Dec 03, 2023 at 05:48:39AM +0800, Kuan-Wei Chiu wrote: >>>> The current heapsort implementation in the siftdown function follows >>>> the standard textbook version, necessitating two comparisons at each >>>> level. Transitioning to the Bottom-up heapsort version allows us to >>>> decrease the required comparisons to just one per level. On average, >>>> this modification significantly reduces the comparison count from >>>> 2nlog2(n) - 3n + o(n) to nlog2(n) + 0.37*n + o(n). >>>> >>>> Refs: >>>> BOTTOM-UP-HEAPSORT, a new variant of HEAPSORT beating, on an average, >>>> QUICKSORT (if n is not very small) >>>> Ingo Wegener >>>> Theoretical Computer Science, 118(1); Pages 81-98, 13 September 1993 >>>> https://doi.org/10.1016/0304-3975(93)90364-Y >>>> >>>> Signed-off-by: Kuan-Wei Chiu <visitorckw@gmail.com> >>>> --- >>>> stdlib/qsort.c | 36 ++++++++++++++++++------------------ >>>> 1 file changed, 18 insertions(+), 18 deletions(-) >>>> >>>> diff --git a/stdlib/qsort.c b/stdlib/qsort.c >>>> index be01fb5598..f5babef150 100644 >>>> --- a/stdlib/qsort.c >>>> +++ b/stdlib/qsort.c >>>> @@ -132,26 +132,26 @@ static inline void >>>> siftdown (void *base, size_t size, size_t k, size_t n, >>>> enum swap_type_t swap_type, __compar_d_fn_t cmp, void *arg) >>>> { >>>> - /* There can only be a heap condition violation if there are >>>> - children. */ >>>> - while (2 * k + 1 <= n) >>>> - { >>>> - /* Left child. */ >>>> - size_t j = 2 * k + 1; >>>> - /* If the right child is larger, use it. */ >>>> - if (j < n && cmp (base + (j * size), base + ((j + 1) * size), arg) < 0) >>>> - j++; >>>> - >>>> - /* If k is already >= to its children, we are done. */ >>>> - if (j == k || cmp (base + (k * size), base + (j * size), arg) >= 0) >>>> - break; >>>> + size_t i, j; >>>> + >>>> + /* Find the sift-down path all the way to the leaves. */ >>>> + for (i = k; j = 2 * i + 1, j + 1 <= n;) >>>> + i = cmp (base + j * size, base + (j + 1) * size, arg) >= 0 ? j : (j + 1); >>>> >>>> - /* Heal the violation. */ >>>> - do_swap (base + (size * j), base + (k * size), size, swap_type); >>>> + /* Special case for the last leaf with no sibling. */ >>>> + if (j == n) >>>> + i = j; >>>> >>>> - /* Swapping with j may have introduced a violation at j. Fix >>>> - it in the next loop iteration. */ >>>> - k = j; >>>> + /* Backtrack to the correct location. */ >>>> + while (i != k && cmp (base + k * size, base + i * size, arg) > 0) >>>> + i = (i - 1) / 2; >>>> + >>>> + /* Shift the element into its correct place. */ >>>> + j = i; >>>> + while (i != k) >>>> + { >>>> + i = (i - 1) / 2; >>>> + do_swap (base + i * size, base + j * size, size, swap_type); >>>> } >>>> } >>>> >>>> -- >>>> 2.25.1 >>>> >>> >>> Since we still retain heapsort as a fallback for mergesort, should we >>> reconsider applying this patch? >>> >>> Thanks, >>> Kuan-Wei >> >> Yes, I think it is work and it looks great to me. > > Should I resend a new patch, or can you directly apply this one? It applies as-is, so it should be ok. I will run some tests and apply, thanks. ^ permalink raw reply [flat|nested] 19+ messages in thread
end of thread, other threads:[~2024-03-27 20:42 UTC | newest] Thread overview: 19+ messages (download: mbox.gz / follow: Atom feed) -- links below jump to the message on this page -- 2023-12-02 21:48 [PATCH] stdlib: Optimize number of calls to comparison function Kuan-Wei Chiu 2023-12-04 8:20 ` Florian Weimer 2023-12-04 18:31 ` Kuan-Wei Chiu 2023-12-05 10:44 ` Florian Weimer 2023-12-05 20:00 ` Kuan-Wei Chiu 2023-12-05 20:35 ` Zack Weinberg 2023-12-06 10:21 ` Florian Weimer 2023-12-06 12:51 ` Adhemerval Zanella Netto 2023-12-17 15:42 ` Zack Weinberg 2023-12-17 15:55 ` Florian Weimer 2023-12-17 16:47 ` Kuan-Wei Chiu 2023-12-17 18:02 ` Florian Weimer 2023-12-05 3:22 ` [PATCH v2 0/2] stdlib: Optimize number of calls to comparison function in qsort Kuan-Wei Chiu 2023-12-05 3:22 ` [PATCH v2 1/2] " Kuan-Wei Chiu 2023-12-05 3:22 ` [PATCH v2 2/2] stdlib: Adjust the factor in tst-qsort5 Kuan-Wei Chiu 2024-02-16 7:08 ` [PATCH] stdlib: Optimize number of calls to comparison function Kuan-Wei Chiu 2024-03-27 19:45 ` Adhemerval Zanella Netto 2024-03-27 19:59 ` Kuan-Wei Chiu 2024-03-27 20:42 ` Adhemerval Zanella Netto
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).