From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mail-pj1-x1030.google.com (mail-pj1-x1030.google.com [IPv6:2607:f8b0:4864:20::1030]) by sourceware.org (Postfix) with ESMTPS id 11B123858D20 for ; Sat, 13 Jan 2024 17:25:52 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.2 sourceware.org 11B123858D20 Authentication-Results: sourceware.org; dmarc=pass (p=none dis=none) header.from=gmail.com Authentication-Results: sourceware.org; spf=pass smtp.mailfrom=gmail.com ARC-Filter: OpenARC Filter v1.0.0 sourceware.org 11B123858D20 Authentication-Results: server2.sourceware.org; arc=none smtp.remote-ip=2607:f8b0:4864:20::1030 ARC-Seal: i=1; a=rsa-sha256; d=sourceware.org; s=key; t=1705166754; cv=none; b=gZD3U48+E1hQ/HStlTKLOWItFrFNZffzdDIB0TNYiwCOlKtU+KfiqoVuBxOQkezePk57zR+FshQE/lEKcQ/0rUW+Rzj8hkZkItINEWidIG58kCwsxj7Vjf/ORUfQpQQdSdTLMYUCMjgFunJOxe+rhvs5Ib04pJKhYMwMRvtAyjc= ARC-Message-Signature: i=1; a=rsa-sha256; d=sourceware.org; s=key; t=1705166754; c=relaxed/simple; bh=igeNDqB2sd1qDrLKsCW9uZlZJZhcvrNVAK7PpypoJfI=; h=DKIM-Signature:Date:From:To:Subject:Message-ID:MIME-Version; b=Brohvt+fcdEmy8NZQfa6W2b8DRrg4eaFycXTBY1z9V39V8CpVmKExrunBi40jVFP84jiZojBg42oGQXjAETXwzuiP7sJ3aSd5K4jaYC1/9qzyVjnJEPJTuAgAIeNotCplgO0B/QZu/YiQMiUf5HMVE6QWzPgoCgd8decsFRD/xA= ARC-Authentication-Results: i=1; server2.sourceware.org Received: by mail-pj1-x1030.google.com with SMTP id 98e67ed59e1d1-28e141e677cso236439a91.0 for ; Sat, 13 Jan 2024 09:25:52 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20230601; t=1705166750; x=1705771550; darn=sourceware.org; h=in-reply-to:content-transfer-encoding:content-disposition :mime-version:references:message-id:subject:to:from:date:from:to:cc :subject:date:message-id:reply-to; bh=PdfsLhGR4E2zGfFYc84FWiC1SSBSI+NdtV9smviGbnI=; b=DqC2Bc8lbh0FjrOH+T0HvUtd6OflZQ3+aBMl3ZX3PTjGL7184moILUnyakNAoCSwo1 4AAKXYVnMBXmZGUvyMc7d9axjyHTtSAHkRhg6WdPIJucRYezTvDe8eSbvMVsU4snJ7Vc nDQOUIB/hhDdTjBI1clopA/uQ36cZfs7thGqYzq0RRCWs+M6ya2Qz4hdzX+6mVpCVRGm e+N/nMs0usHtFPsfwcuL2wyCzJjrvj3/5iiu4DK/puOzH6XiVNhWKJqRmSIrrTo2gKCe 4/piuSmeVCjXspI+2dR10+bWpFWu8ZSrj+jLfqaScL5F+5aU1SdIC9dCnLhltqtrcbp3 m3Xw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1705166750; x=1705771550; h=in-reply-to:content-transfer-encoding:content-disposition :mime-version:references:message-id:subject:to:from:date :x-gm-message-state:from:to:cc:subject:date:message-id:reply-to; bh=PdfsLhGR4E2zGfFYc84FWiC1SSBSI+NdtV9smviGbnI=; b=V988T5nNTcHd9P/OQvkjxi8zry54VTXeOv91pN2uEzJoyJRRuTEbJCVraaxtaioDf1 4NNEHj02Sev0/RDniDXubBiYNojx2P+RviSGaIcxHZ4WS60zV2MuQwxav4DhXioUSynK VZhEJMybrJW9LiTJUNKuoZwnXuSkpTqdYqjw1+IT44Sngknkhur0L0a8PZwqo48Xg3ui PUuhGchVb2IhfdqCLLmF9xTstyiEHqgiA8haM+6RX5PA84As8/HEVmrK/Ct7XQIFsVxA Xy2PjufHlQUumwWeyzDmScSSSGtdfmVT0MTn14nwHQ91UT4ZnmP0BKmxJkxeS74jZwUf mC3Q== X-Gm-Message-State: AOJu0YxWw5JyUdAFXgiagkR6iPlXKn0KXpE+783kE5RoLINOr0JjRfeP eZLxhoX4lHIDNecbtNDL0prdaBEjROc= X-Google-Smtp-Source: AGHT+IFamIku68iKAqT2CrVzk8wj1YEX3U54m74d5u86g2e5njLSazsAYhudwry8cFRQeTpUogfVmg== X-Received: by 2002:a17:90b:1e44:b0:28c:ef1a:db19 with SMTP id pi4-20020a17090b1e4400b0028cef1adb19mr5310668pjb.0.1705166749874; Sat, 13 Jan 2024 09:25:49 -0800 (PST) Received: from visitorckw-System-Product-Name (IP-216-168.cs.nctu.edu.tw. [140.113.216.168]) by smtp.gmail.com with ESMTPSA id sg9-20020a17090b520900b0028d9fc97c29sm6518142pjb.14.2024.01.13.09.25.48 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Sat, 13 Jan 2024 09:25:49 -0800 (PST) Date: Sun, 14 Jan 2024 01:25:46 +0800 From: Kuan-Wei Chiu To: newlib@sourceware.org, brian.inglis@systematicsw.ab.ca Subject: Re: [PATCH] Implement introsort for qsort Message-ID: References: <20231230195940.1989559-1-visitorckw@gmail.com> MIME-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Disposition: inline Content-Transfer-Encoding: 8bit In-Reply-To: <20231230195940.1989559-1-visitorckw@gmail.com> X-Spam-Status: No, score=-3.6 required=5.0 tests=BAYES_00,DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF,FREEMAIL_FROM,KAM_SHORT,RCVD_IN_DNSWL_NONE,SPF_HELO_NONE,SPF_PASS,TXREP,T_SCC_BODY_TEXT_LINE autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org List-Id: On Sun, Dec 31, 2023 at 03:59:40AM +0800, Kuan-Wei Chiu wrote: > Enhances the qsort implementation by introducing introsort to address > the worst-case time complexity of O(n^2) associated with quicksort. > Introsort is utilized to switch to heapsort when quicksort recursion > depth becomes excessive, ensuring a worst-case time complexity of > O(n log n). > > The heapsort implementation adopts a bottom-up approach, significantly > reducing the number of required comparisons and enhancing overall > efficiency. > > Refs: > Introspective Sorting and Selection Algorithms > David R. Musser > Software—Practice & Experience, 27(8); Pages 983–993, Aug 1997 > https://dl.acm.org/doi/10.5555/261387.261395 > > A killer adversary for quicksort > M. D. McIlroy > Software—Practice & Experience, 29(4); Pages 341–344, 10 April 1999 > https://dl.acm.org/doi/10.5555/311868.311871 > > BOTTOM-UP-HEAPSORT, a new variant of HEAPSORT beating, on an average, > QUICKSORT (if n is not very small) > Ingo Wegener > Theoretical Computer Science, 118(1); Pages 81-98, 13 September 1993 > https://dl.acm.org/doi/10.5555/162625.162643 > > Signed-off-by: Kuan-Wei Chiu > --- > To assess the performance of the new introsort and the old quicksort in > worst-case scenarios, we examined the number of comparisons required > for sorting based on the paper "A killer adversary for quicksort." > > Before the patch: > n = 1000, cmp_count = 100853 > n = 2000, cmp_count = 395991 > n = 3000, cmp_count = 885617 > n = 4000, cmp_count = 1569692 > n = 5000, cmp_count = 2448183 > n = 6000, cmp_count = 3521140 > n = 7000, cmp_count = 4788493 > n = 8000, cmp_count = 6250342 > n = 9000, cmp_count = 7906610 > n = 10000, cmp_count = 9757353 > > After the patch: > n = 1000, cmp_count = 27094 > n = 2000, cmp_count = 61500 > n = 3000, cmp_count = 100529 > n = 4000, cmp_count = 136538 > n = 5000, cmp_count = 182698 > n = 6000, cmp_count = 221120 > n = 7000, cmp_count = 259933 > n = 8000, cmp_count = 299058 > n = 9000, cmp_count = 356405 > n = 10000, cmp_count = 397723 > > newlib/libc/search/qsort.c | 153 +++++++++++++++++++++++++++++++++++-- > 1 file changed, 147 insertions(+), 6 deletions(-) Hi Brian, Thank you for your feedback, and I apologize for the late response. I found your reply in the archives of the newlib mailing list on the web interface. However, for some reason, I did not find the same email in my email client. Upon reviewing the previously mentioned issue, it seems that although both instances lead to a worst-case degradation to O(n^2), the causes appear to be different. One may be the adoption of insertion sort at an inappropriate time, causing degradation, while the other could be adversarial input leading to the selection of an unsuitable pivot. I can delve deeper into this issue later, but I believe it should be different patches. I looked at the code at BSD implementation [1], but I am skeptical that it would address the problem of adversarial input degrading to O(n^2). I will conduct experiments in subsequent emails to verify this. Thank you for reminding me to consider not only the number of comparisons but also the number of swaps. Indeed, switching to heapsort may likely increase the required number of swaps, but while the number of swaps would be at an O(n log n) level, the number of comparisons would be at an O(n^2) level. Therefore, a trade-off needs to be made, and I will conduct further experiments. If increasing the number of swaps does lead to degradation in efficiency, it would be better to drop this patch. Regarding the experimental data, as I am not a native English speaker, I would like to inquire about the meaning of "two values out of order repeated to fill the array, in alternating rows and alternate halves." This clarification will help me avoid any misunderstanding of the text. I will post new experimental data within the next few days. Once again, thank you for your feedback and suggestions! Best regards, Kuan-Wei Chiu Link: https://cgit.freebsd.org/src/tree/lib/libc/stdlib/qsort.c [1]