Date: Sat, 3 Dec 2022 16:24:53 +0100
From: Corinna Vinschen
To: cygwin@cygwin.com
Subject: Re: [BUG core?] Regression with parsing Windows’ command-line
Reply-To: cygwin@cygwin.com
Mail-Followup-To: cygwin@cygwin.com
References: <20221116124824.zzobomcsmowvjtbr@math.berkeley.edu> <20221203034030.a6ghnwcze4rkqeap@math.berkeley.edu> <20221203192810.03c73015303ef3ad4fe241f3@nifty.ne.jp>
In-Reply-To: <20221203192810.03c73015303ef3ad4fe241f3@nifty.ne.jp>

On Dec 3 19:28, Takashi Yano via Cygwin wrote:
> On Fri, 2 Dec 2022 19:40:30 -0800
> Ilya Zakharevich wrote:
> > On Wed, Nov 16, 2022 at 04:48:25AM -0800, I wrote:
> > > De-quoting (converting the Windows’ command-line into argc/argv) does
> > > not remove double quotes if characters not fit for 8-bit (?) are present.
> > >
> > > To reproduce, do in CMD’s command line:
> > >
> > >    D:\> D:\Programs\cygwin2022\bin\perl -wle "print for @ARGV" . "/i/" "/и/" .
> > >    .
> > >    /i/
> > >    "/и/"
> > >    .
> > [...]
> This certainly seems to be a problem of cygwin1.dll.
>
> Though I am not sure this is the right thing, I have confirmed
> that the following patch solves the issue.
>
> diff --git a/newlib/libc/locale/lctype.c b/newlib/libc/locale/lctype.c
> index 644669765..732d132e1 100644
> --- a/newlib/libc/locale/lctype.c
> +++ b/newlib/libc/locale/lctype.c
> @@ -25,11 +25,20 @@
>  
>  #define LCCTYPE_SIZE (sizeof(struct lc_ctype_T) / sizeof(char *))
>  
> +#ifdef __CYGWIN__
> +static char numsix[] = { '\6', '\0'};
> +#else
>  static char numone[] = { '\1', '\0'};
> +#endif
>  
>  const struct lc_ctype_T _C_ctype_locale = {
> +#ifdef __CYGWIN__
> +	"UTF-8",			/* codeset */
> +	numsix				/* mb_cur_max */
> +#else
>  	"ASCII",			/* codeset */
>  	numone				/* mb_cur_max */
> +#endif

Good idea, but this transforms the "C" locale into the "C.UTF-8" locale
once and for all. What we're actually missing is a matching
_C_utf8_ctype_locale which can be used by Cygwin as default locale
setting, AFAICS.

I pushed a patch and the test release is rebuilding while I type.


Thanks,
Corinna
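
To illustrate the distinction drawn above (keep "C" as ASCII and add a
separate UTF-8 sibling to use as the default), here is a minimal sketch.
It is not the code in cygwin1.dll or newlib and not the pushed patch.
The names struct ctype_info, c_ctype, c_utf8_ctype, char_len and dequote
are all hypothetical; only the two fields (codeset, mb_cur_max) and
their values are taken from the quoted diff. The de-quoting routine is a
simplified stand-in that just drops double quotes.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for newlib's struct lc_ctype_T; only the two
   fields visible in the quoted diff are modelled. */
struct ctype_info {
    const char *codeset;
    int mb_cur_max;
};

/* "C" stays ASCII; a separate sibling record carries UTF-8.
   The value 6 mirrors the diff's '\6'; this sketch itself only
   decodes sequences of up to 4 bytes. */
static const struct ctype_info c_ctype      = { "ASCII", 1 };
static const struct ctype_info c_utf8_ctype = { "UTF-8", 6 };

/* Byte length of the character starting at s, or -1 if the lead byte
   is not valid in the given codeset. Continuation bytes are not
   validated here. */
static int char_len(const struct ctype_info *ct, const unsigned char *s)
{
    assert(ct != NULL && s != NULL);
    if (*s < 0x80)
        return 1;                       /* plain ASCII byte */
    if (ct->mb_cur_max == 1)
        return -1;                      /* ASCII codeset: no high bytes */
    if ((*s & 0xE0) == 0xC0) return 2;
    if ((*s & 0xF0) == 0xE0) return 3;
    if ((*s & 0xF8) == 0xF0) return 4;
    return -1;                          /* stray continuation or bad lead */
}

/* Copy src to dst with double quotes removed, one character at a time.
   Returns 0 on success, -1 if a character cannot be decoded (dst is
   then left unterminated). dst must hold strlen(src) + 1 bytes. */
static int dequote(const struct ctype_info *ct, const char *src, char *dst)
{
    const unsigned char *p = (const unsigned char *)src;

    while (*p != '\0') {
        int n = char_len(ct, p);
        if (n < 0)
            return -1;
        if (n == 1 && *p == '"') {      /* drop the quote itself */
            p++;
            continue;
        }
        for (int i = 0; i < n; i++) {
            if (p[i] == '\0')
                return -1;              /* truncated multibyte sequence */
            *dst++ = (char)p[i];
        }
        p += n;
    }
    *dst = '\0';
    return 0;
}
```

With c_utf8_ctype, the quoted argument "/и/" (и is the two bytes D0 B8 in
UTF-8) de-quotes to /и/. With c_ctype the same input is rejected at the
D0 byte, while "/i/" still works. Keeping two records means programs that
explicitly ask for "C" continue to see ASCII.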