public inbox for cygwin@cygwin.com
* Inconsistency with sort -n?
@ 2008-12-31 21:39 Buchbinder, Barry (NIH/NIAID) [E]
  2009-01-02 23:07 ` Dave Korn
  2009-01-07 14:27 ` Eric Blake
  0 siblings, 2 replies; 3+ messages in thread
From: Buchbinder, Barry (NIH/NIAID) [E] @ 2008-12-31 21:39 UTC
  To: cygwin

`sort -n' and `sort -g' work inconsistently with 0 and -0 if there are leading spaces.  Sometimes -0 is before 0, as I would expect, and sometimes it is afterwards.  Adding `-b' does not seem to help.

Is this where I should report it or should I go upstream?

In case you are wondering why I want to do this:  I'm counting items in bins, and the bin from -1 to 0 is distinct from the bin from 0 to +1.

Please excuse the excessive number of examples.

Happy Gregorian New Year!

- Barry
  Disclaimer: Statements made herein are not made on behalf of NIAID.
________________

$ echo 0 -1 -0 1 | tr ' ' '\n' | sort -n
-1
-0
0
1
$ echo 0 -1 -0 1 | tr ' ' '\n' | sed -e 's/^/ /' | sort -n
 -1
 -0
 0
 1
$ echo 0 -1 -0 1 | tr ' ' '\n' | sed -e '/-/s/^/ /' | sort -n
 -1
 -0
0
1
$ echo 0 -1 -0 1 | tr ' ' '\n' | sed -e '/^.$/s/^/ /' | sort -n
-1
 0
-0
 1
$ echo 0 -1 -0 1 | tr ' ' '\n' | sed -e '/-/s/^/ /' -e 's/^/ /' | sort -n
  -1
  -0
 0
 1
$ echo 0 -1 -0 1 | tr ' ' '\n' | sed -e '/^.$/s/^/ /' -e 's/^/ /' | sort -n
 -1
  0
 -0
  1
$ echo 0 -1 -0 1 | tr ' ' '\n' | sed -e '/-/s/^/    /' -e 's/^/ /' | sort -n
     -1
     -0
 0
 1
$ echo 0 -1 -0 1 | tr ' ' '\n' | sed -e '/-/!s/^/    /' -e 's/^/ /' | sort -n
 -1
     0
 -0
     1
$ echo 0 -1 -0 1 | tr ' ' '\n' | sort -g
-1
-0
0
1
$ echo 0 -1 -0 1 | tr ' ' '\n' | sed -e 's/^/ /' | sort -g
 -1
 -0
 0
 1
$ echo 0 -1 -0 1 | tr ' ' '\n' | sed -e '/-/s/^/ /' | sort -g
 -1
 -0
0
1
$ echo 0 -1 -0 1 | tr ' ' '\n' | sed -e '/^.$/s/^/ /' | sort -g
-1
 0
-0
 1
$ echo 0 -1 -0 1 | tr ' ' '\n' | sed -e '/-/s/^/ /' -e 's/^/ /' | sort -g
  -1
  -0
 0
 1
$ echo 0 -1 -0 1 | tr ' ' '\n' | sed -e '/^.$/s/^/ /' -e 's/^/ /' | sort -g
 -1
  0
 -0
  1
$ echo 0 -1 -0 1 | tr ' ' '\n' | sed -e '/-/s/^/    /' -e 's/^/ /' | sort -g
     -1
     -0
 0
 1
$ echo 0 -1 -0 1 | tr ' ' '\n' | sed -e '/-/!s/^/    /' -e 's/^/ /' | sort -g
 -1
     0
 -0
     1
$ echo 0 -1 -0 1 | tr ' ' '\n' | sort -nb
-1
-0
0
1
$ echo 0 -1 -0 1 | tr ' ' '\n' | sed -e 's/^/ /' | sort -nb
 -1
 -0
 0
 1
$ echo 0 -1 -0 1 | tr ' ' '\n' | sed -e '/-/s/^/ /' | sort -nb
 -1
 -0
0
1
$ echo 0 -1 -0 1 | tr ' ' '\n' | sed -e '/^.$/s/^/ /' | sort -nb
-1
 0
-0
 1
$ echo 0 -1 -0 1 | tr ' ' '\n' | sed -e '/-/s/^/ /' -e 's/^/ /' | sort -nb
  -1
  -0
 0
 1
$ echo 0 -1 -0 1 | tr ' ' '\n' | sed -e '/^.$/s/^/ /' -e 's/^/ /' | sort -nb
 -1
  0
 -0
  1
$ echo 0 -1 -0 1 | tr ' ' '\n' | sed -e '/-/s/^/    /' -e 's/^/ /' | sort -nb
     -1
     -0
 0
 1
$ echo 0 -1 -0 1 | tr ' ' '\n' | sed -e '/-/!s/^/    /' -e 's/^/ /' | sort -nb
 -1
     0
 -0
     1

--
Unsubscribe info:      http://cygwin.com/ml/#unsubscribe-simple
Problem reports:       http://cygwin.com/problems.html
Documentation:         http://cygwin.com/docs.html
FAQ:                   http://cygwin.com/faq/

* Re: Inconsistency with sort -n?
  2008-12-31 21:39 Inconsistency with sort -n? Buchbinder, Barry (NIH/NIAID) [E]
@ 2009-01-02 23:07 ` Dave Korn
  2009-01-07 14:27 ` Eric Blake
  1 sibling, 0 replies; 3+ messages in thread
From: Dave Korn @ 2009-01-02 23:07 UTC
  To: cygwin

Buchbinder, Barry (NIH/NIAID) [E] wrote:
> `sort -n' and `sort -g' work inconsistently with 0 and -0 if there are
> leading spaces.  Sometimes -0 is before 0, as I would expect, and sometimes
> it is afterwards.  Adding `-b' does not seem to help.
>
> Is this where I should report it or should I go upstream?

  Kinda depends where it's coming from. Could be newlib, could be cygwin,
could be sort itself. Either the + and - zeros aren't being correctly
converted to their float representations, or the comparison of + vs. - zero
isn't working right, at a first guess.

> In case you are wondering why I want to do this:  I'm counting items in a
> bin so the bin from -1 to 0 and 0 to +1 are different.

  Hacky work-around: " | sort -r | sort [-n|-g] -s". The first, alphabetic
sort with -r gets all the negative numbers to the start of the list; adding
the stable flag (-s) to the numeric sort then preserves their relative
ordering when they compare equal.
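
A minimal sketch of this two-pass workaround, assuming GNU sort, the C
locale (forced with LC_ALL=C so that comparisons are byte-wise), and the
space-padded input from the original report. The variable names are only
for illustration.

```shell
# Space-padded input from the thread: non-negative values carry a leading blank.
input=$(printf '%s\n' '-1' ' 0' '-0' ' 1')

# Pass 1: reverse byte-wise sort puts lines starting with '-' (0x2D)
#         ahead of lines starting with ' ' (0x20).
# Pass 2: stable numeric sort (-s) keeps that relative order for lines
#         that compare numerically equal, such as "-0" and " 0".
workaround=$(printf '%s\n' "$input" | LC_ALL=C sort -r | LC_ALL=C sort -n -s)

printf '%s\n' "$workaround"
```

Under those assumptions this prints -1, -0, 0, 1 (with the padding kept).
Note that the first pass depends on the padding: with unpadded input in
the C locale, '0' (0x30) sorts after '-', so the reverse sort would put
"0" ahead of "-0" instead.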

> Happy Gregorian New Year!

  Happy Pastafarian Noodly YARRRR!

    cheers,
      DaveK

* Re: Inconsistency with sort -n?
  2008-12-31 21:39 Inconsistency with sort -n? Buchbinder, Barry (NIH/NIAID) [E]
  2009-01-02 23:07 ` Dave Korn
@ 2009-01-07 14:27 ` Eric Blake
  1 sibling, 0 replies; 3+ messages in thread
From: Eric Blake @ 2009-01-07 14:27 UTC
  To: cygwin

According to Buchbinder, Barry (NIH/NIAID) [E] on 12/31/2008 2:29 PM:

[sorry for my delay in replying]

> `sort -n' and `sort -g' work inconsistently with 0 and -0 if there are leading spaces.  Sometimes -0 is before 0, as I would expect, and sometimes it is afterwards.  Adding `-b' does not seem to help.
> 
> Is this where I should report it or should I go upstream?

If it were a bug, it would be an upstream issue (I reproduced your test
cases on Linux).  But it is not a bug; sort is behaving as documented.

> $ echo 0 -1 -0 1 | tr ' ' '\n' | sed -e '/^.$/s/^/ /' | sort -n
> -1
>  0
> -0
>  1

sort -n sorts the entire line based on numeric value (0 and -0 have the
same value), then breaks ties based on byte-wise values (' ' comes before
'-').
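
The tie-break can be made visible by switching it off. The sketch below
assumes GNU sort, whose -s option disables the last-resort comparison, and
forces the C locale so the fallback really is byte-wise; the variable names
are illustrative only.

```shell
# Two lines that compare numerically equal; input order is "-0" first.
input=$(printf '%s\n' '-0' ' 0')

# Default: the numeric tie is broken by a last-resort byte-wise
# comparison of the whole line, so " 0" (0x20) precedes "-0" (0x2D).
default_order=$(printf '%s\n' "$input" | LC_ALL=C sort -n)

# With -s the last-resort comparison is disabled, so tied lines
# stay in their input order.
stable_order=$(printf '%s\n' "$input" | LC_ALL=C sort -n -s)

printf 'default:\n%s\nstable:\n%s\n' "$default_order" "$stable_order"
```

The two results differ only in the order of the tied lines, which is the
effect described above.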

> $ echo 0 -1 -0 1 | tr ' ' '\n' | sed -e '/^.$/s/^/ /' | sort -g
> -1
>  0
> -0
>  1

sort -g is slower than sort -n, because it converts to floating point; and
although -0.0 and +0.0 are distinct bit patterns, they still sort equal,
so you are once again back to the fallback of bytewise comparison to
break ties (and ' ' still comes before '-').

Use sort -u to see that 0 and -0 sort numerically equal, and thus why a
fallback sort must be attempted.

$ echo 0 -1 -0 1 | tr ' ' '\n' | sed -e '/^.$/s/^/ /' | sort -nu
-1
 0
 1

Or, go one better - use two sort keys.  Make the primary key sort
numerically, and the second sort key break ties in favor of '-':

$ echo 0 -1 -0 1 | tr ' ' '\n' | sed -e '/^.$/s/^/ /' | sort -k1,1n -k1r
-1
-0
 0
 1
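
The reversed second key works here because the non-negative lines are
padded with a blank, and ' ' sorts before '-'. If the padding cannot be
relied on, one possible alternative (not from this thread) is a
decorate-sort-undecorate pipeline that makes the sign an explicit key.
This is only a sketch: sort_signed_zero is a hypothetical helper name, and
it assumes plain integer lines without tabs, a POSIX awk, and GNU sort.

```shell
# Hypothetical helper: numeric sort that places "-0" before "0"
# regardless of leading blanks.
sort_signed_zero() {
  tab=$(printf '\t')
  # Decorate: prefix 0 for lines whose number starts with '-', else 1.
  awk '{ flag = ($0 ~ /^[ \t]*-/) ? 0 : 1; print flag "\t" $0 }' |
    # Primary key: the original line, numerically (field 2).
    # Secondary key: the sign flag (field 1), so "-0" wins ties.
    LC_ALL=C sort -t "$tab" -k2,2n -k1,1n |
    # Undecorate: drop the flag column, keep the original line.
    cut -f2-
}

printf '%s\n' 0 -1 -0 1 | sort_signed_zero
```

With the unpadded input shown, this prints -1, -0, 0, 1; the padded input
from the report gives the same order with its blanks preserved.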

-- 
Don't work too hard, make some time for fun as well!

Eric Blake             ebb9@byu.net
volunteer cygwin coreutils maintainer