From: Eric Blake
To: cygwin@cygwin.com
Subject: Re: gawk 4.1.4: CR separate char for CRLF files
Message-ID: <391b0ca2-e495-a908-160a-6d95492f526f@redhat.com>
Date: Wed, 09 Aug 2017 11:03:00 -0000
In-Reply-To: <001001d310ea$ceeee230$6ccca690$@gmx.net>

On 08/09/2017 03:37 AM, Jannick wrote:

> Which is a pretty much of a pain when there is no easy fallback solution
> provided in case a major change is applied. E.g. for sed - if I understand
> the reference to sed in https://cygwin.com/ml/cygwin/2017-08/msg00033.html
> correctly - a separate switch '-b' is added.

Incorrect. 'sed -b' has always existed, but did NOT do what you wanted
(it forced CR to be treated as a separate character; where what you want
is to ignore CR if it appears before LF).
In fact, the coordinated change made back in February to all of grep,
sed, and awk, was that all three programs now default to what used to be
possible only through 'sed -b', because silently stripping CR can
corrupt data when you are not expecting it, while requiring the user to
explicitly strip CR when they know they are working with CRLF line
endings is less magic (fewer downstream patches, and more obvious in
looking at a script that the script knows what it is doing).

If your data lives on a text mount (instead of a binary mount), then you
still get CR stripping for free. If your data comes from a pipeline
rather than the file system, then you can add a d2u or other
CR-stripping tool in the pipeline.

> This is - to say the least - unpleasant in the light of what Cygwin claims
> to be, namely 'a large collection of GNU and Open Source tools which provide
> functionality similar to a Linux distribution on Windows' (from the top of
> the start website www.cygwin.com).

On Linux, nothing strips CR automatically. So on Cygwin, we behave the
same - nothing strips CR automatically on binary mounted data.

And the fact that the change was made AND ANNOUNCED back in February,
but you are only now, 6 months later, complaining about it, is telling.

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.
+1-919-301-3266
Virtualization: qemu.org | libvirt.org
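
For the pipeline case mentioned above (adding d2u or another
CR-stripping tool), here is a minimal sketch of doing the stripping
explicitly in awk itself. The helper name strip_crlf is hypothetical,
not something shipped with Cygwin or gawk, and the sample input is made
up for illustration. It removes only a CR that sits directly before the
LF, which is roughly the behavior the original poster was asking for.

```shell
#!/bin/sh
# strip_crlf is a hypothetical helper name used only for this sketch.
# It deletes a single CR at the end of each record, i.e. a CR that
# appears directly before the LF; a CR elsewhere in the line is kept.
strip_crlf() {
    awk '{ sub(/\r$/, ""); print }'
}

# Made-up CRLF input arriving on a pipeline, followed by ordinary
# field processing. Without strip_crlf, $2 would still carry the CR.
printf 'alpha 1\r\nbeta 2\r\n' | strip_crlf | awk '{ print $2 "|" $1 }'
# prints:
# 1|alpha
# 2|beta
```

Putting the strip step in the script makes it visible to a reader that
the script expects CRLF input. On Cygwin, d2u in the same pipeline
position should serve the same purpose.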