To: cygwin@cygwin.com
From: Ralf
Subject: Re: length in gawk returns wrong value
Date: Thu, 19 Jul 2012 11:27:00 -0000
References: <20120719092024.GA31055@calimero.vinschen.de>

Corinna Vinschen writes:
>
> Uh oh.  1.7.9 is old.  Please update.
>
> > 0000000   R 374   c   k   e   n  \r  \n
> > 0000010
> > Length: 1
> >
> > What can I do to get the correct length in gawk without changing
> > ttt.txt?
>
> Dunno.  This is not what I see.  What did you have $LANG and $LC_CTYPE
> set to?  Here's what I see:
>
>   $ uname -a
>   CYGWIN_NT-6.1 vmbert7 1.7.16(0.261/5/3) 2012-07-09 14:51 i686 Cygwin
>
>   $ echo $LANG
>   C.UTF-8
>
>   $ echo "Rücken" > ttt.txt
>   $ od -c ttt.txt
>   0000000   R 303 274   c   k   e   n  \n
>   0000010
>
>   $ gawk '{print "Length: " length($0)}' ttt.txt
>   Length: 6
>
>   $ gawk --version | head -1
>   GNU Awk 4.0.1
>
> Corinna
>

After updating, I added the following lines at the top of my script:

  export LANG=C.UTF-8
  echo LANG: $LANG
  echo LC_CTYPE: $LC_TYPE
  c:/unix/bin/gawk --version | head -1

And this is my output:

  LANG: C.UTF-8
  LC_CTYPE:
  GNU Awk 4.0.1
  CYGWIN_NT-6.0-WOW64 WIESWEG 1.7.15(0.260/5/3) 2012-05-09 10:25 i686 Cygwin
  0000000   R 374   c   k   e   n  \r  \n
  0000010
  Length: 5

Very strange! But after adding

  export LC_CTYPE=C

I got the correct result.

Thanks for your quick help!

--
Problem reports:       http://cygwin.com/problems.html
FAQ:                   http://cygwin.com/faq/
Documentation:         http://cygwin.com/docs.html
Unsubscribe info:      http://cygwin.com/ml/#unsubscribe-simple
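For anyone finding this thread later, here is a minimal sketch of the workaround, assuming the file is in a single-byte encoding such as ISO-8859-1 (the od dump shows ü as the single byte 374, not the UTF-8 pair 303 274) and that the trailing \r should not be counted. The function name byte_length, the file name ttt_latin1.txt and the AWK variable are made up for illustration; they are not from the thread. The locale is set only for the one awk call, so the rest of the script keeps its LANG.

```shell
#!/bin/sh
# Hypothetical helper (not from the thread): print the length of each line
# in bytes, ignoring a trailing carriage return.
byte_length() {
    # LC_ALL takes precedence over LC_CTYPE and LANG, and setting it on the
    # command line affects only this one awk process.
    # AWK is an assumed override hook, e.g. AWK=gawk or AWK=c:/unix/bin/gawk.
    LC_ALL=C "${AWK:-awk}" '{ sub(/\r$/, ""); print "Length: " length($0) }' "$1"
}

# Recreate the sample from the od dump: R, octal 374 (ü in ISO-8859-1),
# c, k, e, n, then CR LF. The file name is hypothetical.
printf 'R\374cken\r\n' > ttt_latin1.txt

byte_length ttt_latin1.txt
```

Note that this counts bytes, which matches the character count only for single-byte encodings. On a UTF-8 file the same function would count ü as two. In a UTF-8 locale the lone byte 374 is not a valid sequence, which is presumably why the character count came out unexpected there.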