From: "Mikhail Usenko via cygwin"
Reply-To: Mikhail Usenko
To: cygwin@cygwin.com
Cc: Dmitry Katsubo
Date: Thu, 22 Mar 2018 13:35:00 -0000
Subject: Re: Quotes around command-line argument that has unicode characters are not removed
Message-Id: <20180322152437.a37c3dd3b778bba765e2124c@inbox.ru>
In-Reply-To: <08d9621d-b9a0-c0d7-b58b-581ab957a08c@mail.ru>
References: <08d9621d-b9a0-c0d7-b58b-581ab957a08c@mail.ru>

On Thu, 22 Mar 2018 01:15:00 +0100
Dmitry Katsubo via cygwin <...> wrote:

> Dear Cygwin community,
>
> I observe the following on my Cygwin: when I put quotes around file that has
> non-ASCII symbols, these quotes are passed to argv of the process literally,
> otherwise they are removed. I would expect that there is a consistency.
>
> I have written a small C program that displays arguments, and run it three
> times:
>
> #1 For the file with space, taken into quotes ("the file.txt") -- OK
> #2 For the file with non-ASCII characters (Château.txt) -- OK
> #3 For the file with non-ASCII characters, taken into quotes ("Château.txt") -- WRONG
>
> d:\cli> uname -a
> CYGWIN_NT-6.1-WOW PC 2.9.0(0.318/5/3) 2017-09-12 10:41 i686 Cygwin
>
> D:\cli> chcp
> Active code page: 866
>
> D:\cli> dir
> ...cut...
> 2018-03-22 00:43 0 Château.txt
> 2018-03-22 00:01 393 test.c
> 2018-03-22 00:01 150,230 test.exe
> 2018-03-21 00:15 186 test.pl
> 2018-03-22 00:43 0 the file.txt
> 2018-03-22 00:40 16 текст плюс.txt
> 6 File(s) 150,825 bytes
> 2 Dir(s) 41,972,293,632 bytes free
>
> D:\cli> test "the file.txt"
> param 0 = test
> param 1 = the file.txt
> File 'the file.txt' was opened
>
> D:\cli> test Château.txt
> param 0 = test
> param 1 = Château.txt
> File 'Château.txt' was opened
>
> D:\cli> test "Château.txt"
> param 0 = test
> param 1 = "Château.txt"
> Failed to open '"Château.txt"': No such file or directory
>
> As one can see, the last run fails. I am a bit puzzled: how can I pass the name
> of the file with space and Unicode symbols? I need to do it in uniform way, as I
> am calling a Cygwin program from native Windows program, as in [1].
>
> D:\cli> test "текст плюс.txt"
> param 0 = test
> param 1 = "текст плюс.txt"
> Failed to open '"текст плюс.txt"': No such file or directory
>
> I have search a bit, but I couldn't find a direct answer. From post [1] and [2]
> I see that compiler inserts the code to do some argument pre-processing like
> @pathnames [3], but what are exactly the rules? Is quote pre-processing done in
> dcrt0.cc:177 [4]?
>
> Any feedback is appreciated.
>
> [1] https://sourceware.org/ml/cygwin/2016-05/msg00082.html
> [2] http://daviddeley.com/autohotkey/parameters/parameters.htm
> [3] https://cygwin.com/cygwin-ug-net/using-specialnames.html#pathnames-at
> [4] https://github.com/openunix/cygwin/blob/master/winsup/cygwin/dcrt0.cc#L177
>
> === test.c ===
> #include <stdio.h>
> #include <string.h>
> #include <errno.h>
>
> int main(int argc, char* argv[])
> {
>     for (int i = 0; i < argc; i++)
>     {
>         printf("param %d = %s\n", i, argv[i]);
>     }
>     FILE* f = fopen(argv[1], "r");
>     if (f != NULL)
>     {
>         printf("File '%s' was opened\n", argv[1]);
>         fclose(f);
>     } else {
>         printf("Failed to open '%s': %s\n", argv[1], strerror(errno));
>     }
>     return 0;
> }
>
> --

Hello, Dmitry,
consider these test cases:

Native (msvcrt) binary:
-----------------------
$ x86_64-w64-mingw32-gcc test.c -o test-win.exe
$ ldd test-win.exe
        ntdll.dll => /cygdrive/c/Windows/SYSTEM32/ntdll.dll (0x7fa05900000)
        KERNEL32.DLL => /cygdrive/c/Windows/system32/KERNEL32.DLL (0x7fa030e0000)
        KERNELBASE.dll => /cygdrive/c/Windows/system32/KERNELBASE.dll (0x7fa028f0000)
        msvcrt.dll => /cygdrive/c/Windows/system32/msvcrt.dll (0x7fa03220000)
-----------------------

Cygwin-flavor binary:
---------------------
$ gcc test.c -o test-cygwin.exe
$ ldd test-cygwin.exe
        ntdll.dll => /cygdrive/c/Windows/SYSTEM32/ntdll.dll (0x7fa05900000)
        KERNEL32.DLL => /cygdrive/c/Windows/system32/KERNEL32.DLL (0x7fa030e0000)
        KERNELBASE.dll => /cygdrive/c/Windows/system32/KERNELBASE.dll (0x7fa028f0000)
        cygwin1.dll => /usr/bin/cygwin1.dll (0x180040000)
---------------------

Create a file with non-ascii chars in the name:
-----------------------------------------------
$ touch "текст плюс.txt"
-----------------------------------------------

Run both binaries in mintty with bash:
--------------------------------------
$ ./test-win "текст плюс.txt"
param 0 = D:\wroot\test.cygwin\Quotes around command-line argument that has unicode characters are not removed\test-win.exe
param 1 = ▒▒▒▒▒ ▒▒▒▒.txt
File '▒▒▒▒▒ ▒▒▒▒.txt' was opened

$ ./test-cygwin "текст плюс.txt"
param 0 = ./test-cygwin
param 1 = текст плюс.txt
File 'текст плюс.txt' was opened
--------------------------------------

Run the binaries in cmd.exe with bash:
--------------------------------------
$ ./test-win "текст плюс.txt"
param 0 = D:\wroot\test.cygwin\Quotes around command-line argument that has unicode characters are not removed\test-win.exe
param 1 = ЄхъёЄ яы■ё.txt
File 'ЄхъёЄ яы■ё.txt' was opened

$ ./test-cygwin "текст плюс.txt"
param 0 = ./test-cygwin
param 1 = текст плюс.txt
File 'текст плюс.txt' was opened
--------------------------------------

Run in bare cmd.exe
(/usr/bin/cygwin1.dll should be copied next to ./test-cygwin.exe)
-------------------
D:\wroot\test.cygwin\Quotes around command-line argument that has unicode characters are not removed>.\test-win.exe "текст плюс.txt"
param 0 = .\test-win.exe
param 1 = ЄхъёЄ яы■ё.txt
File 'ЄхъёЄ яы■ё.txt' was opened

D:\wroot\test.cygwin\Quotes around command-line argument that has unicode characters are not removed>.\test-cygwin.exe "текст плюс.txt"
param 0 = ./test-cygwin
param 1 = "текст плюс.txt"
Failed to open '"текст плюс.txt"': No such file or directory
-------------------

In bare cmd.exe, the native (msvcrt) binary works with quoted non-ASCII
arguments: the quotes are removed and the file is opened, even though the name
is printed in a different encoding. The Cygwin-flavor binary does not: it
receives the quotes literally and fails to open the file. I don't know exactly
which layer is responsible for this behavior: cmd.exe, or
msvcrt.dll/cygwin1.dll.
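
For comparison, the quote handling that the three test cases expect can be sketched in a few lines of C. This is only a simplified illustration, not the code in dcrt0.cc or msvcrt.dll: the names split_command_line, check_examples and MAX_ARG_LEN are hypothetical, backslash escapes, globbing and @pathnames are ignored, and the command line is assumed to be a byte string in an ASCII-compatible encoding such as UTF-8. The point it shows is that a byte-wise splitter has no reason to treat non-ASCII bytes differently, so quotes around "текст плюс.txt" would be dropped just like those around "the file.txt".

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MAX_ARG_LEN 256  /* hypothetical per-argument limit for this sketch */

/*
 * Split cmdline into arguments separated by spaces or tabs.
 * A double quote toggles "quoted" mode and is not copied to the output;
 * every other byte (including bytes >= 0x80) is copied unchanged.
 * Returns the number of arguments, or -1 if there are more than max_args
 * arguments or one of them does not fit into MAX_ARG_LEN bytes.
 */
int split_command_line(const char *cmdline, char args[][MAX_ARG_LEN], int max_args)
{
    const unsigned char *p = (const unsigned char *)cmdline;
    int argc = 0;

    for (;;) {
        /* Skip the whitespace between arguments. */
        while (*p == ' ' || *p == '\t')
            p++;
        if (*p == '\0')
            break;
        if (argc >= max_args)
            return -1;

        size_t len = 0;
        int in_quotes = 0;
        while (*p != '\0' && (in_quotes || (*p != ' ' && *p != '\t'))) {
            if (*p == '"') {
                in_quotes = !in_quotes;  /* the quote itself is dropped */
            } else {
                if (len + 1 >= MAX_ARG_LEN)
                    return -1;
                args[argc][len++] = (char)*p;
            }
            p++;
        }
        args[argc][len] = '\0';
        argc++;
    }
    return argc;
}

/* The command lines from the thread, as this sketch would split them. */
void check_examples(void)
{
    char args[4][MAX_ARG_LEN];

    /* Quoted ASCII name with a space. */
    assert(split_command_line("test \"the file.txt\"", args, 4) == 2);
    assert(strcmp(args[1], "the file.txt") == 0);

    /* Quoted non-ASCII names: the quotes are removed in the same way. */
    assert(split_command_line("test \"Château.txt\"", args, 4) == 2);
    assert(strcmp(args[1], "Château.txt") == 0);

    assert(split_command_line("test \"текст плюс.txt\"", args, 4) == 2);
    assert(strcmp(args[1], "текст плюс.txt") == 0);
}
```

Calling check_examples() from a main() exercises the three command lines. Comparing its behavior with what test-cygwin.exe prints in bare cmd.exe may help narrow down where the quotes stop being processed.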