Date: Tue, 31 Jan 2017 13:16:00 -0000
From: Corinna Vinschen
To: cygwin@cygwin.com
Subject: Re: [ANNOUNCEMENT] Updated: dash-0.5.8-3
Message-ID: <20170131131616.GC29504@calimero.vinschen.de>
In-Reply-To: <20170131100402.GB29504@calimero.vinschen.de>

On Jan 31 11:04, Corinna Vinschen wrote:
> On Jan 28 14:44, Houder wrote:
> > On Wed, 25 Jan 2017 16:14:00, Steven Penny wrote:
> > > Obviously Bash is not the problem, nor readline as Dash doesn't use readline. So
> > > it appears the issue this time is again with cygwin1.dll, or perhaps the Dash
> > > package.
> >
> > .. uhm, it appears to me that Windows is the issue here.
> >
> > As those in the know do not feel inclined to respond, I will provide some
> > guesses that are my own:
> >
> >  - in terms of input buffer management, utf-8 encoded characters will not
> >    be recognized in case of bash and dash ... (they are under Fedora)
> >     - see the output of stty -a: iutf8 is not present (it is under Fedora)
> >  - readline provides bash with input buffer management for utf-8 encoded
> >    characters on Windows (that is why it 'works' in case of bash)
> >  - bash has support for utf-8 encoded characters ...
> >    (e.g. ls -l ? will include one-character filenames in case the name is
> >     made up of only one multi-byte character)
> >  - dash has no such support ... [1][2]
> >
> > Consequently, dash is only partly useful, even more so on Windows (as it
> > would require an additional "helper" on Windows in order to obtain proper
> > line-editing). Helper? readline, libedit ...
> >
> > However, I am only guessing ... (only Erik and Corinna can provide expert
> > details here).
>
> I'm not quite sure yet but apparently the problem is in the handling of
> VERASE in the termios implementation.  In cooked mode it fills a char
> buffer with what has been typed.  The code doesn't know if the bytes in
> the buffer are UTF-8 chars or just random bytes.  So VERASE erases
> exactly one byte, which means that, in the case of UTF-8 chars, it only
> erases the last byte of a multibyte character.
>
> It seems the Linux termios implementation is different in that it
> still knows which bytes constitute a single keypress and thus knows
> how many bytes it has to erase.

OK, here's what happens on Linux:

The termios code supports a flag, IUTF8.  This flag determines whether
the termios code checks for UTF-8 characters in the input when performing
an ERASE.  If IUTF8 is set, it checks in a loop whether the byte it just
erased is a UTF-8 continuation byte.  If so, it erases another byte.


Corinna

-- 
Corinna Vinschen                  Please, send mails regarding Cygwin to
Cygwin Maintainer                 cygwin AT cygwin DOT com
Red Hat
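
As a rough illustration of the IUTF8 erase loop described in this message,
here is a minimal C sketch.  It is not the Linux or Cygwin code: the
function names erase_one and is_utf8_continuation, and the flat byte
buffer standing in for the cooked-mode input buffer, are assumptions made
up for this example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* True if c is a UTF-8 continuation byte, i.e. has the bit pattern 10xxxxxx. */
bool is_utf8_continuation(unsigned char c)
{
    return (c & 0xC0) == 0x80;
}

/*
 * Hypothetical helper (not a real termios function): erase one character
 * from the end of a line buffer holding `len` bytes and return the new
 * length.
 *
 * Without iutf8 exactly one byte is dropped.  With iutf8, bytes keep being
 * dropped for as long as the byte just erased was a continuation byte, so
 * a whole multibyte character goes away, including its lead byte.
 */
size_t erase_one(const unsigned char *buf, size_t len, bool iutf8)
{
    assert(buf != NULL || len == 0);

    while (len > 0) {
        unsigned char erased = buf[--len];  /* drop the last byte */
        if (!iutf8 || !is_utf8_continuation(erased))
            break;                          /* whole character is gone */
    }
    return len;
}
```

For the three bytes 'a' 0xC3 0xA4 (the letter a followed by a two-byte
UTF-8 character), erase_one returns 2 without the flag, leaving the lead
byte 0xC3 dangling in the buffer, and 1 with the flag set.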