From: Vermessung AVT - Wolfgang Rieger
To: "cygwin@cygwin.com"
Subject: RE: gawk 4.1.4: CR separate char for CRLF files
Date: Mon, 14 Aug 2017 10:36:00 -0000

On Wed, 9 Aug 2017 10:38 +0000, Jannick wrote:

--- snip ---
> Now I can see the following *easy* solutions to the very
situation here (= input only for now):
>
> 1 - Inserting the BEGIN section as you suggested into more than 1k scripts (not feasible due to additional regression test workload)
>
> 2 - Calling 'gawk -vRS=\r\n -vORS=\r\n' instead of 'gawk' (hack to turn back the additional the latest gawk's complexity, wrapper needed)
>
> 3 - Wrapping a d2u/u2d pipe solution (additional app and wrapper needed again)
>
> 4 - Using another compiled version of gawk which does *not* disable the out-of-the-box gawk feature to swallow CRs (cf., e.g., http://git.savannah.gnu.org/cgit/gawk.git/tree/awkgram.y#n3543), i.e.
> without the artificial obstacle to now know the EOL type of the input file ahead of running gawk.
>
>> It works in all my cases. The only disadvantage: you have to know what kind
>
>... plus the disadvantage to systematically amend all the scripts instead of having an external solution
>
>> of files you want to handle in the awk script. The same awk script
>> will not
>> work for DOS files as well as for linux files.
>
>... another issue originated by the change and which didn't exist before.
>
>> Best
>>
>> Roger
>
> Please don't get me wrong, but this raises a real issue here and I am not sure which rationale other than 'let's get more of the Linux-feel' drove the decision.
>
> All the best,
> J.

--- snip ---

Another solution which we have been using for many years now, though it might not be feasible for you:

We very rarely update Cygwin. We have been using Cygwin for some 15+ years now. We use tools like gawk (hundreds of scripts), head, tail, sort, etc. in shell scripts running under cmd.exe (no Unix shells involved). I soon realized that Cygwin upgrades may cause trouble with existing scripts, so we only update if we really need to (e.g. new functionality that would be important, the shift from 32 to 64 bit, possibly new Windows versions, bugs we need fixed).
I have followed the past discussions about the CR/LF behaviour changes attentively and decided not to update in the near future, because that would lead to a massive problem with many hundreds of scripts - hoping that at some point there will be a change in gawk again.

What is Unix-like or OS-like or POSIX-like behaviour in that context? You could argue that gawk interprets line endings like the underlying OS does (i.e., gawk reads LF on Unix and CR/LF on Windows), or that it interprets line endings Unix-style regardless of the underlying OS. That's a developer's decision in my opinion.

But since with pipes or output redirection gawk wrote no CRs even in previous versions, we already had the problem that gawk had to accept *both* inputs, LF with or without CR. That has worked fine for the most part, since fortunately most Windows and other application software we use accepts both record formats (we had issues with upgrades of other vendors' software no longer accepting pure LF, but that only concerned a very small number of scripts). With the new approach in Cygwin that seems to be broken, so we have not upgraded Cygwin since then (we currently use gawk 4.1.3).

Of course the reason for that really annoying CR/LF thing is the arrogance and ignorance of MS, which has cost innumerable useless developer hours when I think of the endless discussions and changes in Cygwin; but MS is the one who defines the standards because of its sheer market power, so we have to deal with it, whether we like it or not. I'd definitely prefer to use Unix for its powerful tools, but most of the software we use is simply not available for Unix, and MS does not provide gawk etc. So we have to deal with that CR/LF issue in a pragmatic rather than a more, say, philosophical way: we need to run our scripts with as few changes as possible. So that's why we upgrade Cygwin as seldom as possible.
It is a "living system", yes, which is great on the one hand - but can be annoying in everyday practice.

In my opinion there should be at least an option for gawk to accept both LF and CR/LF line endings equally, preferably via a system variable so that there is no need to change the gawk command line at all. That's what I vote for.

Kind regards,
Wolfgang
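
To illustrate the kind of switch asked for above, here is a minimal sketch of a wrapper that accepts both LF and CR/LF input without touching the awk programs themselves. It is only an approximation under stated assumptions: the function name awk_any_eol and the variable AWK_ACCEPT_CRLF are invented for this example and are not gawk features, the wrapper normalizes standard input only (file arguments are passed through untouched), and the output is written with plain LF.

```shell
#!/bin/sh
# Hypothetical wrapper, not part of gawk or Cygwin.
# AWK_ACCEPT_CRLF is an invented variable name: when it is unset or "1",
# one trailing CR is stripped from every input line before the unchanged
# awk program sees it; any other value runs awk directly.
awk_any_eol() {
    if [ "${AWK_ACCEPT_CRLF:-1}" = "1" ]; then
        # First awk: remove a single CR at end of line, if present.
        # Second awk: the caller's program and options, unmodified.
        awk '{ sub(/\r$/, ""); print }' | awk "$@"
    else
        awk "$@"
    fi
}

# The same program is fed a CR/LF stream and an LF stream.
printf 'a b\r\nc d\r\n' | awk_any_eol '{ print $2 }'
printf 'a b\nc d\n'     | awk_any_eol '{ print $2 }'
```

With the CR stripped up front, $2 should be the same for both inputs, so a script would not need to know the EOL type in advance. Inside gawk itself, an RS longer than one character is treated as a regular expression, so a setting along the lines of RS='\r?\n' should have a similar effect on input; whether that fits a given set of scripts would need testing.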