public inbox for cygwin@cygwin.com
From: Brian Inglis <Brian.Inglis@SystematicSw.ab.ca>
To: cygwin@cygwin.com
Subject: Re: gawk Regression: CR characters are not stripped on Windows
Date: Tue, 27 Feb 2018 15:03:00 -0000
Message-ID: <c281b8e2-1019-ac28-466a-17cce731457c@SystematicSw.ab.ca>
In-Reply-To: <CAGHpTB+bfbts=fOBSQPN7c-NDh8FTXR+EauhDhiVrqbgawcYoA@mail.gmail.com>

On 2018-02-27 00:22, Orgad Shaneh wrote:
> Cross-posting per Eli Zaretskii's request.
> CR characters used to be automatically stripped on Windows (MSYS2 and
> Cygwin environments). This is broken in 4.2.0.

Cygwin binary mounts treat files as Unix does: no CR/LF translation is done.

You missed all the discussions in early 2017 about gawk, grep, sed EOL handling:

https://sourceware.org/ml/cygwin/2017-02/msg00152.html
https://sourceware.org/ml/cygwin/2017-02/msg00188.html
https://sourceware.org/ml/cygwin/2017-02/msg00189.html

following on from discussions about bash after ShellShock:

https://sourceware.org/ml/cygwin/2016-08/msg00097.html

> Minimal example:
> echo -en "foo\r\n\r\nbar\r\n" > foo.txt
> awk '/^$/ { print "found" }' foo.txt # This worked with 4.1.4 and
> doesn't work with 4.2.0
> awk '/^\r$/ { print "found" }' foo.txt # This works with 4.2.0 and
> doesn't work with 4.1.4
>> Under MS-Windows, 'gawk' (and many other text programs) silently
>> translates end-of-line '\r\n' to '\n' on input and '\n' to '\r\n' on
>> output.

Cygwin does not try to be an MS Windows environment.
Cygwin tries its best to be a POSIX/Unix/Linux environment.

> and on Feb 8 the following section was added:
>> Recent versions of Cygwin open all files in binary mode.  This means
>> that you should use 'RS = "\r?\n"' in order to be able to handle
>> standard MS-Windows text files with carriage-return plus line-feed line
>> endings.

Read DOS files through a Cygwin text mount, which does the conversion.
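
A minimal sketch of the RS setting quoted above, assuming an awk that accepts
a regular expression as RS (gawk does; POSIX leaves a multi-character RS
unspecified). The sample data mirrors the minimal example; printf replaces
echo -en only for portability, and the variable name is illustrative.

```shell
# Feed CRLF-terminated sample text to awk on stdin.
# RS='\r?\n' makes an optional CR followed by LF the record terminator,
# so the CR never becomes part of $0 and /^$/ matches the empty line again.
found=$(printf 'foo\r\n\r\nbar\r\n' | awk -v RS='\r?\n' '/^$/ { print "found" }')
echo "$found"
```

The same setting also accepts plain LF input, because the CR is optional.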

> This breaks compatibility between different gawk versions. What were
> the reasons for this change in cygwin, and why was it pushed upstream?

Compatibility with POSIX/Unix/Linux systems, except on a text mount. This
allows scripts which deal with binary data or embedded \r to work correctly,
and requires scripts to work correctly on Windows or Unix text, whichever the
application provides, prefers, or ignores, under Unix/Cygwin/Msys/Mingw alike.
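
One way to write a script that works on either kind of text is to strip a
trailing CR from each record before matching. The sketch below is only an
illustration: count_blank is a hypothetical helper name, and the approach
assumes a CR is never wanted at the end of a line.

```shell
# Hypothetical helper: count empty lines on stdin for LF or CRLF input.
count_blank() {
    awk '{ sub(/\r$/, "") }   # drop one trailing CR, if present
         /^$/ { n++ }         # count records that are now empty
         END { print n + 0 }' # n + 0 prints 0 when nothing matched
}

crlf=$(printf 'foo\r\n\r\nbar\r\n' | count_blank)
lf=$(printf 'foo\n\nbar\n' | count_blank)
echo "CRLF: $crlf, LF: $lf"
```

Both inputs should give the same count, whatever the mount mode or awk
version, since the script no longer depends on who strips the CR.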

-- 
Take care. Thanks, Brian Inglis, Calgary, Alberta, Canada

--
Problem reports:       http://cygwin.com/problems.html
FAQ:                   http://cygwin.com/faq/
Documentation:         http://cygwin.com/docs.html
Unsubscribe info:      http://cygwin.com/ml/#unsubscribe-simple
