From mboxrd@z Thu Jan 1 00:00:00 1970
From: "Kaz Kylheku (Cygwin)" <743-406-3965@kylheku.com>
To: Jérôme Froissart
Cc: cygwin@cygwin.com
Subject: Re: Unconsistent command-line parsing in case of UTF-8 quoted arguments
Date: Sun, 18 Oct 2020 19:32:11 -0700
References: <634821436.20201004141809@yandex.ru>
Message-ID: <6a30ae30f769cba0dbf7a80423c20ac1@mail.kylheku.com>
List-Id: General Cygwin discussions and problem reports

On 2020-10-14 14:47, Jérôme Froissart wrote:
>> The choice of GetCommandLineA was for illustration purposes;
>> had I used GetCommandLineW I would not be able to printf
>> using %ls under CMD.EXE, because of code page issues. However
>> here is a modified version of the test program that uses
>> GetCommandLineW.

[ ... ]

>> billziss@xps:~/Projects/t$ ./cyg.exe "foo bar" "Domain\Jérôme"
>> 0022 " 0043 C 003a : 005c \ 0055 U 0073 s 0065 e 0072 r
>> 0073 s 005c \ 0062 b 0069 i 006c l 006c l 007a z 0069 i
>> 0073 s 0073 s 005c \ 0050 P 0072 r 006f o 006a j 0065 e
>> 0063 c 0074 t 0073 s 005c \ 0074 t 005c \ 0063 c 0079 y
>> 0067 g 002e . 0065 e 0078 x 0065 e 0022 "

[ ... ]

>> C:\Users\billziss\Projects\t>cyg.exe "foo bar" "Domain\Jérôme"
>> 0063 c 0079 y 0067 g 002e . 0065 e 0078 x 0065 e 0020
>> 0020   0022 " 0066 f 006f o 006f o 0020   0062 b 0061 a
>> 0072 r 0022 " 0020   0022 " 0044 D 006f o 006d m 0061 a
>> 0069 i 006e n 005c \ 004a J 00e9 . 0072 r 00f4 . 006d m
>> 0065 e 0022 "

Aha! There is a hint of a problem here.

Firstly, the command lines are obviously different. The Cygwin one
starts with a quote that we did not see, wrapping the full path
to the executable:

  "C:\Users\billziss\Projects\t\cyg.exe"

It ends there. Why is that? I'm guessing that the command line was
tokenized destructively; a null character was written.

But under cmd.exe, we see the whole command line, without any null
character having been written in it. Moreover, the program name
just appears as the original relative path cyg.exe with no quotes.

What a mess. :)
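
To make the "tokenized destructively" guess concrete, here is a minimal sketch in portable C. The names cut_after_first_token and demo_destructive_split are made up for illustration, and this is not the actual Cygwin startup code; I am only assuming it does something of this general shape. The sketch shows how writing a null character after the first token makes the buffer appear to end there to anything that later reads it as a string, even though the arguments are still sitting in memory behind the null.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper (an assumption, not Cygwin's code): split off the
 * first token of a command line IN PLACE. The token ends at the first
 * space that is not inside double quotes. That space is overwritten
 * with '\0', which is the destructive part. Returns a pointer to the
 * remainder of the line with leading spaces skipped, or to the
 * terminating null if nothing follows the token.
 */
char *cut_after_first_token(char *cmdline)
{
    assert(cmdline != NULL);

    char *p = cmdline;
    int in_quote = 0;

    /* Advance to the first unquoted space or to the end of the string. */
    while (*p != '\0' && (in_quote || *p != ' ')) {
        if (*p == '"')
            in_quote = !in_quote;
        p++;
    }

    if (*p == '\0')
        return p;          /* single token: nothing is overwritten */

    *p++ = '\0';           /* destructive write into the caller's buffer */

    while (*p == ' ')      /* skip the separator run */
        p++;
    return p;
}

/* Prints what a later reader of the same buffer would see. */
void demo_destructive_split(void)
{
    char cmdline[] =
        "\"C:\\Users\\billziss\\Projects\\t\\cyg.exe\" "
        "\"foo bar\" \"Domain\\Jerome\"";

    size_t before = strlen(cmdline);
    char *rest = cut_after_first_token(cmdline);

    printf("length before split: %zu\n", before);
    printf("length after split:  %zu\n", strlen(cmdline));
    printf("buffer now reads:    %s\n", cmdline);
    printf("still in memory:     %s\n", rest);
}
```

Calling demo_destructive_split() from main prints the quoted program path as the entire visible string, which is the same shape as the Cygwin dump above: the quoted full path, then nothing. Only unquoted spaces are treated as separators here, which is a simplification.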