From: Thomas Wolff
To: cygwin@cygwin.com
Subject: Re: WSL symbolic links
Date: Fri, 27 Mar 2020 15:43:43 +0100
Message-ID: <26e34f6a-f012-f786-bb50-7b77e263f6c8@towo.net>
In-Reply-To: <20200327130133.GM3261@calimero.vinschen.de>
List-Id: General Cygwin discussions and problem reports

On 27.03.2020 at 14:01, Corinna Vinschen wrote:
> On Mar 27 13:24, Thomas Wolff wrote:
>> On 27.03.2020 at 12:21, Corinna Vinschen wrote:
>>> On Mar 27 00:52, Thomas Wolff wrote:
>>>> [...]
>>>>> rd-reparse '\??\C:\tmp\link' ; echo
>>>> ReparseTag:           0xa000001d
>>>> ReparseDataLength:             8
>>>> Reserved:                      0
>>>> 02 00 00 00 66 69 6c 65
>>>>> rd-reparse '\??\C:\tmp\link-abs' ; echo
>>>> ReparseTag:           0xa000001d
>>>> ReparseDataLength:            19
>>>> Reserved:                      0
>>>> 02 00 00 00 2f 6d 6e 74 2f 63 2f 74 6d 70 2f 66
>>>> 69 6c 65
>>>>> rd-reparse '\??\C:\tmp\link-foo' ; echo
>>>> ReparseTag:           0xa000001d
>>>> ReparseDataLength:             9
>>>> Reserved:                      0
>>>> 02 00 00 00 66 c3 b6 c3 b6
>>>>> rd-reparse '\??\C:\tmp\link-foo-abs' ; echo
>>>> ReparseTag:           0xa000001d
>>>> ReparseDataLength:            20
>>>> Reserved:                      0
>>>> 02 00 00 00 2f 6d 6e 74 2f 63 2f 74 6d 70 2f 66
>>>> c3 b6 c3 b6
>>> [...]
>>> I debugged this now and I found that practically all problems, including
>>> the inability to delete the symlink, are a result of not being able to
>>> open the reparse point correctly as reparse point within Cygwin.  So as
>>> not to destroy something important, Cygwin only opens reparse points as
>>> reparse points if it recognizes the reparse point type.
>>>
>>> Consequentially, all immediate problems go away, as soon as Cygwin
>>> recognizes and handles the symlink :)
>>>
>>> So I created a patch and pushed it.  The latest developer snapshot from
>>> https://cygwin.com/snapshots/ contains this patch.
>> Works, great, thank you!
> Thanks for testing!
>
>>> Funny sidenote: Assuming you create symlinks pointing to files with
>>> non-UTF-8 chars, e. g., umlauts in ISO-8859-1, then the symlink converts
>>> *all* these chars to the Unicode REPLACEMENT CHARACTER 0xfffd.  I assume
>>> this will also happen if you try to create the file with these chars in
>>> the first place, so it's not much of a problem.
>> As Windows filenames are character strings as opposed to Linux filenames
>> which are byte strings, some strange behaviour is unavoidable. I see:
>> $ wsl ls -l link_LW
>> lrwxrwxrwx    1 towo     towo            19 Mar 27 12:11 link_LW -> file_L_
>> $ ls -l link_LW
>> lrwxrwxrwx 1 towo Kein 11 27. Mrz 13:11 link_LW -> file_L_����
>> which looks OK for me.
> Not sure I expressed myself correctly there.  What I was trying to say
> is, the symlink created by WSL already contains the 0xfffd replacement
> char, in UTF-8 \xef \xbf \xbd.  So the info is already lost inside the
> symlink.
I couldn't create a non-UTF-8 file name in WSL on the command line. Even
after starting mintty with LC_ALL=de_DE and running bash in WSL with
LC_ALL=de_DE, keyboard input still appeared as UTF-8 when displayed with
od, which is weird.
Anyway, this can of course be worked around by calling touch from a script
file. In that case, WSL indeed already flattens all invalid characters in
the filename to �.

However, all symbolic link cases work for me. I can point links to file_L_
and file_LW_���� and access the respective files correctly via the links
from both WSL and Cygwin now.

Thomas
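
For reference, the reparse data dumps quoted above can be decoded with a
few lines of Python. This is only a sketch based on what the dumps
show: a 32-bit little-endian value (2 in every dump; its meaning is an
assumption here, not something confirmed in this thread) followed by the
link target as UTF-8 bytes. The function names are made up for
illustration; the tag name is the one the Windows SDK uses for
0xa000001d.

```python
import struct
from typing import Tuple

# Reparse tag seen in all rd-reparse dumps above (Windows SDK name).
IO_REPARSE_TAG_LX_SYMLINK = 0xA000001D


def from_hex_dump(dump: str) -> bytes:
    """Turn a dump line such as '02 00 00 00 66 69 6c 65' into bytes."""
    return bytes.fromhex(dump)


def parse_lx_symlink(data: bytes) -> Tuple[int, str]:
    """Split reparse data into (leading 32-bit value, link target).

    Assumption: 4-byte little-endian header, then the target in UTF-8.
    Bytes that are not valid UTF-8 are shown as U+FFFD.
    """
    if len(data) < 4:
        raise ValueError("reparse data too short")
    (leading,) = struct.unpack_from("<I", data, 0)
    target = data[4:].decode("utf-8", errors="replace")
    return leading, target


if __name__ == "__main__":
    dumps = [
        "02 00 00 00 66 69 6c 65",
        "02 00 00 00 2f 6d 6e 74 2f 63 2f 74 6d 70 2f 66 69 6c 65",
        "02 00 00 00 66 c3 b6 c3 b6",
        "02 00 00 00 2f 6d 6e 74 2f 63 2f 74 6d 70 2f 66 c3 b6 c3 b6",
    ]
    for dump in dumps:
        print(parse_lx_symlink(from_hex_dump(dump)))

    # An ISO-8859-1 'ö' (0xf6) is not valid UTF-8, so a lenient decoder
    # substitutes U+FFFD, which is \xef \xbf \xbd when written as UTF-8.
    lossy = b"f\xf6\xf6".decode("utf-8", errors="replace")
    print(lossy.encode("utf-8"))
```

The last two lines only mimic the effect discussed above with Python's
own decoder; they do not claim to reproduce how WSL performs the
substitution internally.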