From 30259d4588c0b587624afb55daa86fe469fe2d8e Mon Sep 17 00:00:00 2001 From: Paul Eggert Date: Sun, 4 Feb 2024 16:53:22 -0800 Subject: [PATCH] Fix bsearch, qsort etc. doc to match POSIX better MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit * manual/search.texi (Array Search Function): Correct the statement about lfind’s mean runtime: it is proportional to a number (not that number), and this is true only if random elements are searched for. Relax the constraint on bsearch’s array argument: POSIX says it need not be sorted, only partially sorted. Say that the first arg passed to bsearch’s comparison function is the key, and the second arg is an array element, as POSIX requires. For bsearch and qsort, say that the comparison function should not alter the array, as POSIX requires. For qsort, say that the comparison function must define a total order, as POSIX requires. Add a warning that the comparison function should not rely on object addresses. Clarify how to ensure a stable sort, usage of temporary storage, and what algorithmic properties qsort may have. --- manual/search.texi | 63 ++++++++++++++++++++++++++++++---------------- 1 file changed, 41 insertions(+), 22 deletions(-) diff --git a/manual/search.texi b/manual/search.texi index db577a5332..4f373d21ca 100644 --- a/manual/search.texi +++ b/manual/search.texi @@ -84,8 +84,9 @@ The return value is a pointer to the matching element in the array starting at @var{base} if it is found. If no matching element is available @code{NULL} is returned. -The mean runtime of this function is @code{*@var{nmemb}}/2. This -function should only be used if elements often get added to or deleted from +The mean runtime of this function is proportional to @code{*@var{nmemb}/2}, +assuming random elements of the array are searched for. 
This +function should be used only if elements often get added to or deleted from the array in which case it might not be useful to sort the array before searching. @end deftypefun @@ -122,24 +123,31 @@ bytes. If one is sure the element is in the array it is better to use calling @code{lsearch}. @end deftypefun -To search a sorted array for an element matching the key, use the -@code{bsearch} function. The prototype for this function is in +To search a sorted or partially sorted array for an element matching the key, +use the @code{bsearch} function. The prototype for this function is in the header file @file{stdlib.h}. @pindex stdlib.h @deftypefun {void *} bsearch (const void *@var{key}, const void *@var{array}, size_t @var{count}, size_t @var{size}, comparison_fn_t @var{compare}) @standards{ISO, stdlib.h} @safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}} -The @code{bsearch} function searches the sorted array @var{array} for an object +The @code{bsearch} function searches @var{array} for an object that is equivalent to @var{key}. The array contains @var{count} elements, each of which is of size @var{size} bytes. The @var{compare} function is used to perform the comparison. This -function is called with two pointer arguments and should return an +function is called with arguments that point to the key and to an +array element, in that order, and should return an integer less than, equal to, or greater than zero corresponding to -whether its first argument is considered less than, equal to, or greater -than its second argument. The elements of the @var{array} must already -be sorted in ascending order according to this comparison function. +whether the key is considered less than, equal to, or greater than +the array element. The function should not alter the array's contents. + +@var{array} does not need to be completely sorted according to +@var{compare}, but it does need to be partially sorted with respect to +@var{key}. 
That is, the array must begin with all the elements that +compare less than @var{key}; next must come all the elements that +compare equal to @var{key}; and last, all the elements that compare +greater than @var{key}. (Any or all of these sub-sequences may be empty.) The return value is a pointer to the matching array element, or a null pointer if no match is found. If the array contains more than one element @@ -170,21 +178,33 @@ The @var{compare} function is used to perform the comparison on the array elements. This function is called with two pointer arguments and should return an integer less than, equal to, or greater than zero corresponding to whether its first argument is considered less than, -equal to, or greater than its second argument. +equal to, or greater than its second argument. The function must +not alter the array's contents, and must define a total ordering +of all the array elements, including any unusual values such as +floating-point NaN (@pxref{Infinity and NaN}) that are present. + +@code{qsort} may attempt to allocate large amounts of temporary +storage, using @code{malloc}. However, memory allocation failure will +not prevent it from sorting the array. @cindex stable sorting @strong{Warning:} If two objects compare as equal, their order after sorting is unpredictable. That is to say, the sorting is not stable. This can make a difference when the comparison considers only part of the elements. Two elements with the same sort key may differ in other -respects. +respects. The only way to ensure a stable sort, when @var{compare} +does not consider all of the data in the objects being sorted, is to +augment each object with a tie-breaking value, such as its original +array index. -Although the object addresses passed to the comparison function lie -within the array, they need not correspond with the original locations -of those objects because the sorting algorithm may swap around objects -in the array before making some comparisons. 
The only way to perform -a stable sort with @code{qsort} is to first augment the objects with a -monotonic counter of some kind. +@strong{Warning:} The result of @var{compare} should not depend in +any way on the @emph{addresses} of the objects it is comparing. +ISO C requires that both addresses passed to the comparison function +always lie within the original array, and @theglibc{} honors this +requirement, but other C libraries might not. More importantly, the +sorting process may temporarily move objects out of order, so the +relative positions of objects within the array are meaningless while +@code{qsort} is running. Here is a simple example of sorting an array of @code{long int} in numerical order, using the comparison function defined above (@pxref{Comparison @@ -200,11 +220,10 @@ Functions}): @end smallexample The @code{qsort} function derives its name from the fact that it was -originally implemented using the ``quick sort'' algorithm. - -The implementation of @code{qsort} attempts to allocate auxiliary storage -and use the merge sort algorithm, without violating C standard requirement -that arguments passed to the comparison function point within the array. +originally implemented using the ``quick sort'' algorithm. Modern C +libraries (including @theglibc{}) may or may not use this algorithm. +However, you can rely on @code{qsort} to be asymptotically optimal in +the average case (i.e.@: @math{O(n \log n)}). @end deftypefun @node Search/Sort Example -- 2.43.0
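A minimal sketch of two points from the revised manual text above: making a qsort stable by adding a tie-breaking original index, and writing a bsearch comparison function whose first argument is the key and whose second is an array element. The type struct record, its fields, and all function names here are hypothetical and appear neither in the patch nor in the GNU C Library.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical record type: `key' is the sort key, `tag' stands in
   for data the comparison ignores, and `index' is the tie-breaker
   holding the element's original position.  */
struct record
{
  int key;
  char tag;
  size_t index;
};

/* Comparison function for qsort.  It defines a total order on the
   records, never modifies them, and looks only at their contents,
   not at their addresses.  */
static int
compare_records (const void *a, const void *b)
{
  const struct record *ra = a;
  const struct record *rb = b;

  if (ra->key != rb->key)
    return (ra->key > rb->key) - (ra->key < rb->key);
  return (ra->index > rb->index) - (ra->index < rb->index);
}

/* Sort RECS by key, keeping records with equal keys in their
   original relative order.  */
static void
stable_sort_records (struct record *recs, size_t n)
{
  assert (recs != NULL);
  for (size_t i = 0; i < n; i++)
    recs[i].index = i;
  qsort (recs, n, sizeof *recs, compare_records);
}

/* Comparison function for bsearch.  The first argument points to the
   key (here a plain int) and the second to an array element, so the
   two arguments need not have the same type.  */
static int
compare_key_to_record (const void *key, const void *elem)
{
  int k = *(const int *) key;
  const struct record *r = elem;

  return (k > r->key) - (k < r->key);
}

/* Return a record whose key equals KEY, or a null pointer if none
   exists.  RECS must be at least partially sorted with respect to KEY.  */
static struct record *
find_record (int key, struct record *recs, size_t n)
{
  return bsearch (&key, recs, n, sizeof *recs, compare_key_to_record);
}
```

Because the original index is part of the comparison, no two records ever compare equal, so the result no longer depends on which sorting algorithm qsort uses internally. When several records share a key, find_record may return any one of them.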