From mboxrd@z Thu Jan 1 00:00:00 1970
Subject: Re: Quotes around command-line argument that has unicode characters are not removed
To: cygwin@cygwin.com
From: "Dmitry Katsubo via cygwin"
Reply-To: Dmitry Katsubo
Date: Thu, 22 Mar 2018 22:21:00 -0000
Message-ID: <93d66ed8-4dea-ddec-e731-43301ce57271@mail.ru>
In-Reply-To: <162182215.20180322162501@yandex.ru>
References: <08d9621d-b9a0-c0d7-b58b-581ab957a08c@mail.ru> <20180322152437.a37c3dd3b778bba765e2124c@inbox.ru> <162182215.20180322162501@yandex.ru>
X-SW-Source: 2018-03/txt/msg00364.txt.bz2

On 2018-03-22 14:25, Andrey Repin wrote:
> Greetings, Mikhail Usenko!
>
>> In bare cmd.exe native-msvcrt binary is working OK with quoted non-ascii
>> arguments, while cygwin-flavor binary is not. But I don't know exactly which
>> level here: cmd.exe or msvcrt.dll/cygwin1.dll is responsible for
>> such a behavior.

Thanks, Mikhail! I generally agree with you. If you follow the links I
provided in my original mail, you can see that cmd.exe does not do any
argument splitting. I also see that from this method signature [1]:

build_argv (char *cmd, char **&argv, int &argc, int winshell)

which basically takes a string as input and returns an array of strings plus
the number of arguments as output. So the splitting is done either by
msvcrt.dll or by cygwin1.dll, and they have different ways of doing it, which
is OK provided it is documented and done consistently.

I refer back to dcrt0.cc, where the voodoo is done. In particular, at line
165 [2] it checks whether the program was started from bare Windows, and
behaves differently in that case.

On 2018-03-22 12:24, Andrey Repin wrote:
> Run it in bash. I'm pretty sure you will see your results more consistent.

When "test.exe" is run from bash, it behaves correctly because, as you said,
bash does most of the dirty work. I also tried a workaround like the one
below, but it does not work:

D:\cli> bash -c "./test 'текст плюс.txt'"
bash: ./test 'текст плюс.txt': No such file or directory

> Locale settings affecting Cygwin binary.
>
> If you
> set LANG=ru_RU.CP866
> (f.e.)
> before invoking cygwin testcase in native CMD, you will likely see it
> working better.

Thanks for this advice, Andrey. I see that it reacts, but it works worse :)
I think it tells the program to output characters in CP866, but the console
is UTF-8:

D:\cli> set LANG=ru_RU.CP866
D:\cli> test "текст плюс.txt"
param 0 = test
param 1 = ⥪▒▒ ▒▒▒▒.txt
Failed to open '⥪▒▒ ▒▒▒▒.txt': No such file or directory

But.. ta-da! I got it working like this:

D:\cli> set LANG=ru_RU.UTF-8
D:\cli> test "текст плюс.txt"
param 0 = test
param 1 = текст плюс.txt
File 'текст плюс.txt' was opened

Hooray, it worked!

> Alternatively, you could try
> chcp 65001

That does not help:

D:\cli> chcp 65001
Active code page: 65001
D:\cli> test "текст плюс.txt"
param 0 = test
param 1 = "текст плюс.txt"
Failed to open '"текст плюс.txt"': No such file or directory

[1] https://github.com/openunix/cygwin/blob/master/winsup/cygwin/dcrt0.cc#L297
[2] https://github.com/openunix/cygwin/blob/master/winsup/cygwin/dcrt0.cc#L165

-- 
With best regards,
Dmitry
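
To illustrate the kind of work a function shaped like build_argv has to do
(one command-line string in, an array of arguments plus a count out), here is
a minimal sketch in C. It is NOT the code from dcrt0.cc: the name
split_command_line, its parameters and its rules (split on spaces and tabs,
drop double quotes, no backslash escapes, no globbing, no locale handling)
are assumptions made for illustration only. Bytes outside ASCII are copied
unchanged, so in this simplified model a quoted UTF-8 argument loses its
quotes like any other.

```c
/* Hypothetical helper, not Cygwin's build_argv: splits cmd in place on
   spaces and tabs. A double-quoted run stays inside one argument and the
   quote characters themselves are dropped. Bytes >= 0x80 (for example
   UTF-8 sequences) are copied untouched.
   Returns the number of arguments stored in argv (at most max_args). */
static int split_command_line(char *cmd, char **argv, int max_args)
{
    int argc = 0;
    char *src = cmd;

    while (*src != '\0' && argc < max_args) {
        /* Skip whitespace between arguments. */
        while (*src == ' ' || *src == '\t')
            src++;
        if (*src == '\0')
            break;

        char *dst = src;        /* write position, never ahead of src */
        int in_quotes = 0;
        argv[argc++] = dst;

        while (*src != '\0' && (in_quotes || (*src != ' ' && *src != '\t'))) {
            if (*src == '"')
                in_quotes = !in_quotes;   /* toggle state, drop the quote */
            else
                *dst++ = *src;            /* copy the byte as-is */
            src++;
        }

        /* Step past the delimiter before terminating the argument,
           because the terminator may land on the delimiter itself. */
        if (*src != '\0')
            src++;
        *dst = '\0';
    }
    return argc;
}
```

Called on a writable buffer holding test "текст плюс.txt", this sketch
yields two arguments, test and текст плюс.txt, with the quotes removed. The
buffer is modified in place, so a string literal must not be passed directly.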