From: Corinna Vinschen <corinna-cygwin@cygwin.com>
To: cygwin@cygwin.com
Subject: Re: [BUG REPORT]sed -e 's/[B-D]/_/g' replaces unexpected characters
Date: Tue, 25 Jun 2013 19:44:00 -0000
Message-ID: <20130625160911.GC14459@calimero.vinschen.de>
In-Reply-To: <20130625160359.GB14459@calimero.vinschen.de>

On Jun 25 18:03, Corinna Vinschen wrote:
> On Jun 25 15:38, Lavrentiev, Anton (NIH/NLM/NCBI) [C] wrote:
> > > Your locale is zh_CN.UTF-8.  What you're expecting is only guaranteed
> > > in the C locale:
> > 
> > I'm not quite sure it applies here.  I'm using US English Windows 7.
> > 
> > LANG = 'en_US.UTF-8'
> > 
> > I get the same result:
> > 
> > $ echo abcdeABCDE | sed -e 's/[B-D]/_/g'
> > ab__eA___E
> > 
> > BUT:
> > 
> > $ echo abcdeABCDE | LANG=C sed 's/[B-D]/_/g'
> > abcdeA___E
> > 
> > This is very weird, indeed.
> > 
> > OTOH, in Linux I have the same LANG setup, yet it does work
> > correctly:
> > 
> > > echo $LANG
> > en_US.UTF-8
> > > echo abcdeABCDE | sed -e 's/[B-D]/_/g'
> > abcdeA___E
> > 
> > I believe that an en_US UTF-8 string representation for
> > "abcdeABCDE" is not any different from ASCII.
> 
> Wrong.  Try this:
> 
>   $ sort
>   a
>   b
>   c
>   d
>   e
>   A
>   B
>   C
>   D
>   E
>   <Ctrl-D>
>   a
>   A
>   b
>   B
>   c
>   C
>   d
>   D

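Which also means, AFAICS, that Cygwin's sed is doing it right and Linux's
sed is doing it wrong: in this collation order the range B-D covers B, c,
C, d and D, which is exactly what got replaced.  Yes, that puzzles me a
bit at the moment, too.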
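
For anyone who needs the same result regardless of locale, here is a
minimal sketch of two workarounds, plus the C-locale sort order for
comparison with the interleaved order quoted above.  It is not from the
original report, and the variable names are made up for illustration.
It uses LC_ALL rather than LANG because LC_ALL takes precedence over
LANG and LC_COLLATE if either is already set in the environment.

```shell
#!/bin/sh
input=abcdeABCDE

# 1. Pin the C locale: ranges then follow byte values, so B-D is B, C, D.
c_range=$(printf '%s\n' "$input" | LC_ALL=C sed -e 's/[B-D]/_/g')

# 2. List the characters explicitly: no range, so collation order
#    does not come into play.
explicit=$(printf '%s\n' "$input" | sed -e 's/[BCD]/_/g')

# 3. Sort in the C locale: upper case sorts before lower case.
c_sorted=$(printf '%s\n' a b c d e A B C D E | LC_ALL=C sort | tr -d '\n')

printf '%s\n%s\n%s\n' "$c_range" "$explicit" "$c_sorted"
```

The first two lines printed should both be abcdeA___E, and the third
ABCDEabcde.  Listing the characters is probably the safer choice in
scripts that cannot control their environment.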


Corinna

-- 
Corinna Vinschen                  Please, send mails regarding Cygwin to
Cygwin Maintainer                 cygwin AT cygwin DOT com
Red Hat


