public inbox for cygwin@cygwin.com
From: Eric Blake <eblake@redhat.com>
To: cygwin@cygwin.com
Subject: Re: [ANNOUNCEMENT] Updated [test]: sed-4.4-1
Date: Mon, 13 Feb 2017 19:15:00 -0000
Message-ID: <5f509f48-5da7-45f8-fc9e-edc8e541724c@redhat.com>
In-Reply-To: <20170212113222.GF11666@calimero.vinschen.de>



On 02/12/2017 05:32 AM, Corinna Vinschen wrote:
> I understand the desire but it's a pretty tricky problem.  awk is
> used to manipulate text input in the first place so it treats all
> input, files as well as stdin, as text.  So, shall we drop this
> behaviour for files only?  Or for stdin as well?  How many existing
> setups are bound to fail after a change?

I think part of the confusion is that POSIX states that awk behavior is
only well-defined on "text files" - but that is the POSIX definition of
a text file (no invalid characters in multibyte encoding, no over-long
lines, no NUL bytes, trailing newline), and not strictly related to the
Windows definition of text file (one with CRLF line endings).  But
remember, just because POSIX says that awk is only required to be
well-behaved on text files does not mean that awk cannot be usefully
used on non-text files, and anything we do that silently converts binary
data into corrupted text, when a binary mount was requested, gets in the
way of that usage pattern.
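
To make the POSIX definition concrete, here is a minimal sketch of a
checker for three of the four conditions listed above (no NUL bytes, no
over-long lines, trailing newline).  The function name
looks_like_posix_text is hypothetical, the multibyte-validity check is
deliberately left out, and the LINE_MAX fallback of 2048 is an assumption
for platforms whose <limits.h> does not define it.

```c
#include <limits.h>   /* LINE_MAX, where the platform defines it */
#include <stdbool.h>
#include <stddef.h>

#ifndef LINE_MAX
#define LINE_MAX 2048 /* assumption: fall back to the POSIX minimum */
#endif

/*
 * Hypothetical helper: return true if buf[0..len) has no NUL bytes, no
 * line longer than LINE_MAX bytes (newline included), and a trailing
 * newline.  Multibyte validity is NOT checked in this sketch.
 */
bool looks_like_posix_text(const unsigned char *buf, size_t len)
{
    size_t line_len = 0; /* bytes seen so far in the current line */

    for (size_t i = 0; i < len; i++) {
        if (buf[i] == '\0')
            return false;        /* NUL byte */
        line_len++;
        if (line_len > LINE_MAX)
            return false;        /* over-long line */
        if (buf[i] == '\n')
            line_len = 0;        /* start counting the next line */
    }

    /* An empty file has zero lines; otherwise the last byte must be LF. */
    return len == 0 || buf[len - 1] == '\n';
}
```

Note that a file with CRLF line endings passes this check: the CR is just
another byte at the end of each line, which is why the POSIX and Windows
notions of "text file" are independent of each other.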

As long as we aren't using fopen(path, "rb") to force binary mode, but
rather just fopen(path, "r") to let the mount mode rule, we should be
okay for any file that we open.  As for stdin, ideally stdin is either
from a file
(where the shell opened it according to mount mode) or from a pipeline
(where presumably the other end of the pipe opened the file in the
correct mount mode, or where the user can inject a d2u into the pipeline
if they want CR stripped).
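
The difference between the two fopen modes can be observed with a small
sketch like the following.  The helper name count_cr is hypothetical; it
simply counts the CR bytes a program sees when a file is opened with a
given mode string.  On Cygwin, my understanding is that "rb" always yields
the raw bytes while plain "r" defers to the mount mode; on Linux and other
POSIX systems the two modes behave identically.

```c
#include <stdio.h>

/*
 * Hypothetical helper: count the CR bytes visible to the program when
 * `path` is opened with the given fopen mode ("r" or "rb").
 * Returns -1 if the file cannot be opened.
 */
long count_cr(const char *path, const char *mode)
{
    FILE *fp = fopen(path, mode);
    if (fp == NULL)
        return -1;

    long count = 0;
    int c;
    while ((c = getc(fp)) != EOF) {
        if (c == '\r')
            count++;
    }

    fclose(fp);
    return count;
}
```

Comparing count_cr(path, "r") with count_cr(path, "rb") on a CRLF file
shows whether anything stripped the CRs on the way in: the counts are
equal when the mount mode is binary, and the "r" count should drop when
the mount is a text mount.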

Yes, it means that any existing users that were lazily relying on the
forced text mode to automatically strip CRs will now have to fix their
scripts to add a d2u invocation, but I already hit some of that fallout
when I changed bash to quit forcing text mode.
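
For scripts that relied on the implicit stripping, the explicit
replacement is a CRLF-to-LF filter such as d2u.  The core of that
conversion can be sketched as below; this is an illustration of the idea,
not the actual d2u source, and the function name strip_cr_before_lf is
hypothetical.  The real tool handles more cases than this.

```c
#include <stddef.h>

/*
 * Hypothetical helper: rewrite buf in place so that every CRLF pair
 * becomes a single LF.  A CR that is not followed by LF is kept.
 * Returns the new length of the data in buf.
 */
size_t strip_cr_before_lf(char *buf, size_t len)
{
    size_t out = 0;

    for (size_t i = 0; i < len; i++) {
        if (buf[i] == '\r' && i + 1 < len && buf[i + 1] == '\n')
            continue;            /* drop the CR of a CRLF pair */
        buf[out++] = buf[i];
    }

    return out;
}
```

Keeping this step explicit in the pipeline means binary data passes
through untouched unless the user asks for the conversion.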

-- 
Eric Blake   eblake redhat com    +1-919-301-3266
Libvirt virtualization library http://libvirt.org



