From: "Jannick"
Cc: "'Vermessung AVT - Wolfgang Rieger'"
Mail-Followup-To: cygwin@cygwin.com
Subject: RE: gawk 4.1.4: CR separate char for CRLF files
Date: Tue, 15 Aug 2017 15:41:00 -0000
Message-ID: <009a01d315dc$d8ad49a0$8a07dce0$@gmx.net>

Hi Wolfgang,

First of all, many thanks for your interesting experience report and the constructive remarks.

On Mon, 14 Aug 2017 10:36:23 +0000, Vermessung AVT - Wolfgang Rieger wrote:
> Another solution which we have been using for many years now, though it
> might not be feasible for you:

Yes, you are right, unfortunately: We make extensive use of gawk extensions, which have to be upgraded in tandem with gawk. Thus we will move forward with the ongoing gawk development.

> We very rarely update Cygwin. We have been using Cygwin for some 15+
> years now. We use tools like gawk (hundreds of scripts), head, tail, sort, etc.
> that we are using in shell scripts running under cmd.exe (no Unix shells
> involved). I soon realized that upgrades of Cygwin may cause troubles with
> existing scripts, so we only update if we really need to (e.g.: New
> functionality that would be important, 32 to 64 bit shift, eventually new
> Windows versions, bugs we needed to be fixed).
>
> I have followed the discussions about the CR/LF behaviour changes in the
> past attentively and decided not to update in near future, because that
> would lead to a massive problem with many hundreds of scripts - hoping
> that sometimes there will be a change in gawk again.

Agreed - the situation is the same here. Furthermore, we run our heavy processes on a semi-annual basis within a very tight time frame. So Cygwin's update came pretty much out of the blue at the last minute, because we had not used gawk since the last reporting cycle.
It was an unpleasant surprise, and it could have caused serious timing problems had we not decided how to deal with the changed situation. And as you are saying below ...

> What is Unix-like or OS-like or Posix-like behaviour in that context? You could
> argue that gawk interprets line endings like the underlying OS does (i. e.,
> gawk reads LF in Unix and CR/LF in Win), or it interprets line endings in a
> Unix-style no matter of the underlying OS used. That's a developer's decision
> in my opinion.

True. And the developers of gawk opted - with a heavy heart, I believe - to have gawk swallow CRs.

> But since with pipes or output redirection gawk used to write no CRs even in
> previous versions, we already had the problem that gawk had to accept
> *both* inputs, LF with or without CR. That worked widely fine so far, since
> most Windows and other application SW we use accept both record formats,
> fortunately (we had issues with SW upgrades of other vendors no longer
> accepting pure LF, but that only concerned a very small number of scripts).
> With the new approach in Cygwin that seems to be broken, so we did not
> upgrade Cygwin since then (we currently use gawk 4.1.3).

Yes, this is the basis of our software selection process as well. However, we keep pace with gawk's releases, since it is developing nicely, so we need a gawk version that reads files and pipes with either LF or CRLF line endings out of the box.

> Of course the reason for that really annoying CR/LF thing is the arrogance
> and ignorance of MS, which caused innumerable of useless developers'
> hours when I think of the endless discussions and changes in Cygwin; but MS
> is the one who defines the standards because of its very market power, so
> we have to deal with it, if we like or not. I'd definitely prefer to use Unix for
> its powerful tools, but most of the SW we use is simply not available for Unix,
> and MS does not provide gawk etc.
> So we have to deal with that CR/LF issue
> in a pragmatic rather than in a more, say, philosophical approach: We need
> to run our scripts with as little changes as possible. So that's why we upgrade
> Cygwin as seldom as possible. It is a "living system", yes, which is great on
> the one side - but can be annoying in everyday practice.

We are locked into the Windows world as well, so there is no way out of that. So far I was more than happy that the gawk code comes with the feature to silently swallow CRs (cf. the code reference with the exact code line in my previous posting), and that feature was in use until the last update. Now that things have changed tremendously from our point of view, we were forced to run a decision process looking at the alternatives I listed in my first email. The evaluation over the past days led us to the decision to use another source of bilingual versions of gawk and friends (i.e. they read CRLF and LF without any additional hint). This is what the user can opt for.

> In my opinion there should be at least an option for gawk to accept both LF
> and CR/LF line endings equally, preferably with a system variable so that
> there is no need to change the command line call of gawk at all. That's what I
> vote for.

Fully agree - I would have been very much in favor of this as well. I had something close to this in mind in my first posting.

> Kind regards,
> Wolfgang

Best regards,
J.
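
Regarding the wish above for gawk to accept both LF and CR/LF line endings equally: at script level this could be approximated roughly as follows. This is only a sketch under assumptions, not something taken from gawk or Cygwin. The wrapper name `awk_any_eol` is hypothetical, and it relies on nothing beyond the standard awk `sub()` function. It removes one trailing CR from each record before the actual program runs, so LF and CRLF input should behave alike.

```shell
# Hypothetical helper (not part of gawk or Cygwin): strip one trailing CR
# from every input record so that LF and CRLF files are treated alike.
awk_any_eol() {
    # $1 is the awk program text; any remaining arguments are input files.
    prog=$1
    shift
    # The first rule removes the CR and lets awk re-split the fields;
    # the caller's program is appended after it.
    awk '{ sub(/\r$/, "") } '"$prog" "$@"
}

# Usage: the first line ends in CRLF, the second in plain LF.
printf 'a b\r\nc d\n' | awk_any_eol '{ print $2 }'
```

The drawback compared with a built-in option or an environment variable is that every call of gawk has to go through the wrapper, which is exactly the kind of script change one would like to avoid.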