From: Eric Blake
To: cygwin@cygwin.com
Subject: Re: [ANNOUNCEMENT] Updated: libreadline7-7.0.3-3
Date: Fri, 28 Jul 2017 23:55:00 -0000
References: <60db3460-a1c1-7cc2-0369-a7109d782ea0@redhat.com> <597b2988.4338ca0a.87fc4.4954@mx.google.com>
In-Reply-To: <597b2988.4338ca0a.87fc4.4954@mx.google.com>

On 07/28/2017 07:09 AM, Steven Penny wrote:
> On Fri, 28 Jul 2017 06:41:08, Eric Blake wrote:
>> As in /etc/defaults/Cygwin.bat installed by the base-files package?
>
> As in "C:\cygwin64\Cygwin.bat" that can be found after a regular install
> of Cygwin
>

Oh, that one doesn't show up under 'cygcheck -p', so it must be created
by setup.exe.

>> It's short:
>>
>> @echo off
>> setlocal enableextensions
>> set TERM=
>> cd /d "%~dp0bin" && .\bash --login -i
>
> Uh, no? Mine looks like this. Again this is the file installed by
> Cygwin, not something I have home brewed:
>
> @echo off
> C:
> chdir C:\cygwin64\bin
> bash --login -i

Odd that there are two different flavors of the file - but the point
remains that they are doing pretty much the same thing: starting bash.

>
>> so if you are already in a cmd window, and typing cygwin.bat, then it is
>> the same as if you are directly starting bash from cmd.
>
> I am not doing this, I am just using Windows explorer to go to
> "C:\cygwin64", then double clicking "Cygwin.bat"

Yay!
That means you ARE running cygwin in a cmd window (and NOT a mintty
window) - because that's what double-clicking the .bat file does. Okay,
I'm a bit more confident that I'm at least in the right environment to
reproduce what you are seeing. I still think the desktop shortcut
pointing to mintty is a nicer environment than running bash in a cmd
window, but that's orthogonal.

>
>> By the way, I didn't know cygwin.bat was still around (I had to go
>> hunting for it). Ages ago, that used to be the target of the shortcut
>> installed to the desktop, but we switched quite a while ago to having
>> the target point directly to mintty instead of the .bat file (since
>> mintty is a LOT friendlier than running inside cmd).
>
> I don't want mintty. As you know mintty adds another "layer" to the
> situation, another process. If I was using mintty I would not have
> discovered this problem in the first place. Some might say good, but no.
> It is important that even launching Cygwin via Cygwin.bat/bash.exe works
> in a sane manner; and that it not sacrifice features that even cmd.exe
> has, which is the current situation.

I agree that running under cmd should work as well as possible, but cmd
is severely limited, and interactions with mintty get debugged faster
than interactions with cmd.

>
>> Also, what does 'chcp.com' say prior to you executing cygwin.bat
>
> Prior to "Cygwin.bat", it says 65001, because I feel that is the proper
> default for Windows. I set it like this:
>
> REG ADD 'HKCU\Console' /v CodePage /t REG_DWORD /d 65001 /f

Ah, so I see 437 by default on my setup, but you've made a per-user
registry tweak.
So PART of your issue is that cygwin1.dll might not be paying attention
to your tweak (since the code page DOES affect what alt- sequences
produce when using cmd) - it is highly possible that cygwin1.dll still
needs improvements with regards to translating alt- sequences according
to the current code page (when codepage 437 is active, alt-[1-9]-NNN
works according to the current page, while alt-0-NNN works according to
Unicode code point - insofar as the code point is representable in the
current code page). But my other mail pointed out that when using a
plain cmd window (no cygwin in the mix), I'm not entirely sure how the
alt- sequences work for code page 65001 in the first place, to know if
cygwin is even interpreting things correctly.

>
>> are you then trying to call chcp.com WHILE bash is running? I have no
>> idea if cygwin is even aware of/amenable to live code page changes
>> while running inside a cmd window
>
> Yes. However I do not think this is germane to the conversation, as I
> have found you can change it "live" without issue.

Well, it IS relevant if it means that live changes to the code page
SHOULD affect cygwin1.dll behavior dynamically. It's not relevant to
whether bash itself is mishandling alt- sequences in general. My
debugging so far says that with code page 437, typing alt-1-4-8 produces
\xc3\xb6 (the correct UTF-8 encoding of ö, which is decimal code 148 in
code page 437), but that bash's parser loop (in readline) is reading
only one byte at a time: it tries to treat \xc3 as 'meta-n' and triggers
its incremental-search functionality, then treats \xb6 as an incomplete
character to be searched for, rather than realizing that the two bytes
belong together as a multibyte character. That part of readline MIGHT be
related to using pselect() (where configuring pselect out of the
equation takes a different code path to learn if more data needs to be
read before starting to act on the data).
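
To make the "two bytes belong together" point concrete, here is a
minimal sketch in C of how a byte-at-a-time reader can tell from the
lead byte that it has to wait for more input before acting. This is NOT
readline's code, and the names utf8_seq_len and utf8_decode are made up
for illustration; it also skips some validation (overlong 3/4-byte forms
and surrogates), so treat it as a sketch rather than a full decoder.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: length of the UTF-8 sequence introduced by lead
 * byte b, or 0 if b cannot start a sequence (continuation or invalid). */
size_t utf8_seq_len(unsigned char b)
{
    if (b < 0x80) return 1;                /* 0xxxxxxx: plain ASCII */
    if (b >= 0xC2 && b <= 0xDF) return 2;  /* 110xxxxx */
    if (b >= 0xE0 && b <= 0xEF) return 3;  /* 1110xxxx */
    if (b >= 0xF0 && b <= 0xF4) return 4;  /* 11110xxx */
    return 0;
}

/* Hypothetical helper: decode one character from buf[0..len).
 * Returns the number of bytes consumed, 0 if the sequence is still
 * incomplete (caller should read more input), or -1 if malformed. */
int utf8_decode(const unsigned char *buf, size_t len, uint32_t *cp)
{
    static const unsigned char lead_mask[] = { 0, 0x7F, 0x1F, 0x0F, 0x07 };
    size_t need, i;
    uint32_t v;

    if (len == 0) return 0;
    need = utf8_seq_len(buf[0]);
    if (need == 0) return -1;

    /* Every byte after the lead must look like 10xxxxxx. */
    for (i = 1; i < need && i < len; i++)
        if ((buf[i] & 0xC0) != 0x80) return -1;

    if (len < need) return 0;   /* e.g. only \xc3 so far: wait for \xb6 */

    v = buf[0] & lead_mask[need];
    for (i = 1; i < need; i++)
        v = (v << 6) | (buf[i] & 0x3F);
    *cp = v;
    return (int)need;
}
```

With only \xc3 in the buffer, utf8_decode() returns 0 (need more data);
once \xb6 arrives it consumes both bytes and yields U+00F6.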
I still haven't figured out why readline is breaking the input or how to
fix it.

>> Making me chase URLs doesn't help as much as a single mail, with a
>> step-by-step reproduction of what you did vs. what you expected, so that
>> I can refer to a single window rather than multiple browser tabs when
>> trying to follow those same steps.
>
> C'mon man, give me a break here, I am trying. I have been posting on
> this issue for months, and I'd rather not keep regurgitating the same
> stuff over and over.

I know, you ARE helping. But I'm also saying that some forms of help are
easier than others ("look at this link" is less helpful than "here are
the steps"). And it doesn't help that my free time for cygwin is
sporadic.

> At any rate, here is the text from that link,
> 'LATIN SMALL LETTER O WITH DIAERESIS' (U+00F6):
>
> $ chcp.com 65001
> Active code page: 65001
>
> - Alt 148 outputs nothing
> - Alt 0246 outputs nothing

But that's true even in a bare cmd window. So it's hard to say what it
is SUPPOSED to do. I'm trying to debug bash, not cmd's alt- behavior.

> - Pasting this character does not work

Pasting is different from alt- sequences, but I haven't played with that
yet, because pasting in cmd windows is a pain compared to middle-click
pasting in mintty (get the alt- stuff working first, and the pasting may
magically work).

>
> $ chcp.com 437
> Active code page: 437
>
> - Alt 148 works
> - Alt 0246 works
> - Pasting works

Wait, it works for you in bash? It wasn't working for me.

>
> and here is why 65001 is needed at all:
>
> $ cat omega.c
> #include <stdio.h>
> int main() {
> printf("Ω\n");

Bad form to assume your compiler is using a particular code set (but
presumably you write that file in a UTF-8 editor); better would be:
printf("\xce\xa9") which is unambiguously the UTF-8 bytes for U+03a9 (or
even "\u03a9", if your C compiler is new enough).
> }
>
> $ x86_64-pc-cygwin-gcc -o cygwin.exe omega.c
> $ x86_64-w64-mingw32-gcc -o mingw32.exe omega.c
> $ chcp.com 65001
> Active code page: 65001
>
> $ ./cygwin.exe
> Ω
>
> $ ./mingw32.exe
> Ω
>

That makes sense: code page 65001 is the Unicode page, so ideally all
UTF-8 sequences should hit the terminal as their appropriate Unicode
glyphs.

> $ chcp.com 437
> Active code page: 437
>
> $ ./cygwin.exe
> Ω

Cygwin may be at fault here: it is outputting a Unicode glyph even
though the current code page is not Unicode. But I'm also not ruling out
weirdness in cmd (it's not open source, so I can't prove who is at
fault, after all).

>
> $ ./mingw32.exe
> ╬⌐

This is the correct two-character (not multi-byte) sequence to output
according to the code page 437 glyph table (in that code page, \xce is
the character u+256C, and \xa9 is the character u+2310). Spitting out
UTF-8 bytes to a unibyte locale (which is what code page 437 is) is
generally going to produce mojibake. Which is why using a unibyte locale
is annoying.

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.
+1-919-301-3266
Virtualization: qemu.org | libvirt.org
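
P.S. For anyone who wants to check the code page 437 reading of the
mingw32 output by hand, here is a small sketch in C. It is only an
illustration: cp437_to_unicode and utf8_two_byte are hypothetical names,
and the lookup covers just the three high bytes mentioned in this thread
(a real table has 128 high entries). Bytes below 0x80 are treated as
plain ASCII for simplicity.

```c
#include <stdint.h>

/* Hypothetical, deliberately partial CP437 -> Unicode lookup.
 * Anything not covered by this sketch comes back as U+FFFD. */
uint32_t cp437_to_unicode(unsigned char b)
{
    if (b < 0x80) return b;      /* simplification: treat as ASCII */
    switch (b) {
    case 0x94: return 0x00F6;    /* o with diaeresis (decimal 148) */
    case 0xA9: return 0x2310;    /* reversed not sign */
    case 0xCE: return 0x256C;    /* box drawings double vertical and horizontal */
    default:   return 0xFFFD;    /* not in this partial table */
    }
}

/* Hypothetical helper: combine a 110xxxxx lead byte and a 10xxxxxx
 * continuation byte into one code point. No validation is done. */
uint32_t utf8_two_byte(unsigned char b0, unsigned char b1)
{
    return ((uint32_t)(b0 & 0x1F) << 6) | (uint32_t)(b1 & 0x3F);
}
```

Read as UTF-8, the bytes \xce\xa9 combine into the single code point
U+03A9; read one byte at a time through the 437 table, the same two
bytes become U+256C followed by U+2310, which is the mojibake shown for
mingw32.exe.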