public inbox for gcc-bugs@sourceware.org
help / color / mirror / Atom feed
From: "sjhowe at dial dot pipex dot com" <gcc-bugzilla@gcc.gnu.org>
To: gcc-bugs@gcc.gnu.org
Subject: [Bug libstdc++/35968] nth_element fails to meet its complexity requirements
Date: Mon, 21 Apr 2008 18:52:00 -0000	[thread overview]
Message-ID: <20080421185144.5705.qmail@sourceware.org> (raw)
In-Reply-To: <bug-35968-16076@http.gcc.gnu.org/bugzilla/>



------- Comment #4 from sjhowe at dial dot pipex dot com  2008-04-21 18:51 -------
Yes. You want a partition that is O(1) that each time round eliminates N/2
elements (bearing in mind the post-condition for nth_element where iterators
greater than the kth iterator have elements that are >= the kth element and
iterators less than the kth iterator have elements that are <= the kth element)
So median-of-3 or for large N is a must. And this is run for 3/2 * log2(N)
steps.

If it has not converged by end of steps (so it has been a bad run) or new N is
< some constant (making binary insertion sort worthwhile) then at that point
you want the cheapest approximate median algorithm that is _guaranteed_ O(N).
The algorithm is still O(N) as the choosing median and partitioning is occuring
in serial. In this case, it is minimising the constant factor that matters. 
The median-of-median of 5 is well known
But this approximate median is less well known.
So it is the combination of
   (i) guaranteed O(N) partitioning
   (ii) cheapest constant factor (so minimising iterator traversal, copy ctor,
swapping, comparing etc)
I have not yet checked median of median-of-5 against this, but I believe (ii)
works out cheaper.
And if it works that there exists an even cheaper guranteed O(N) partitioning
that finds an approximate median, gcc should use that. It does not matter if
the exact median is found each time round just as long as N/2 elements are
reduced each time round. 

I have also been alerted to a intraselect variation of nth_element() that looks
promising. It is this:
   If you wanted the iterator that was 20% in from the start and you sampled
elements to choose a partition element, instead of choosing an approximate
median, choose the element that is 20%. So if the sample was 5 elements, choose
the 2nd element. Now if partitioning was lop-sided but you reduced the number
of elements drastically leaving a tiny amount as candidates, you have massively
reduced the problem. This is brand new research in nth_element but I have not
yet seen much analysis.

Stephen Howe


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=35968


  parent reply	other threads:[~2008-04-21 18:52 UTC|newest]

Thread overview: 15+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2008-04-17 20:39 [Bug libstdc++/35968] New: " sjhowe at dial dot pipex dot com
2008-04-20 17:31 ` [Bug libstdc++/35968] " pcarlini at suse dot de
2008-04-21  3:23 ` roger at eyesopen dot com
2008-04-21  8:25 ` pcarlini at suse dot de
2008-04-21 18:52 ` sjhowe at dial dot pipex dot com [this message]
2008-04-24 15:02 ` roger at eyesopen dot com
2008-04-24 23:40 ` bkoz at gcc dot gnu dot org
2008-04-28 20:18 ` sjhowe at dial dot pipex dot com
2009-12-15 17:19 ` paolo dot carlini at oracle dot com
2010-01-28 17:17 ` paolo dot carlini at oracle dot com
     [not found] <bug-35968-4@http.gcc.gnu.org/bugzilla/>
2015-05-14  0:22 ` lopresti at gmail dot com
2020-04-05 21:56 ` andersk at mit dot edu
2020-04-06 15:53 ` lopresti at gmail dot com
2020-04-06 18:32 ` andersk at mit dot edu
2020-06-19 15:35 ` sjhowe at dial dot pipex.com

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20080421185144.5705.qmail@sourceware.org \
    --to=gcc-bugzilla@gcc.gnu.org \
    --cc=gcc-bugs@gcc.gnu.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).