Date: Mon, 17 Apr 2023 15:46:52 +0200
From: Gionatan Danti
To: cygwin@cygwin.com
Cc: Corinna Vinschen
Subject: Re: Can not stat file with utf char U+F020
References: <992b3c28d7f1cfc17f7c9bb47b53f770@assyoma.it> <1274a3199d9bedab4f15d209694c6e1f@assyoma.it>

On 2023-04-17 11:05 Corinna Vinschen wrote:
> It's actually not the "dos" mount option but specific filesystems
> which trigger the conversion from U+0020 to U+F020.

OK.

> However, the conversion back is handled in a piece of code which has
> no information about the underlying filesystem, so the F0xx -> 00xx
> conversion is done all the time. Adding filesystem info in this
> place is really tricky.

Ah, I missed it, thanks!

With this new information, I made some progress.

First, I use the "dos" mount option to always trigger the conversion of a space or dot at the end of a filename into U+F0xx chars. Now I am able to create such a strange-looking file (as seen in Explorer) from within cygwin itself.
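To make the mapping concrete, here is a minimal sketch of it, assuming the PUA counterpart of a char is simply its code point plus 0xF000 (U+0020 -> U+F020, U+002E -> U+F02E). The function names are made up for illustration and this is not cygwin's actual code; it also handles only a single trailing char.

```python
PUA_OFFSET = 0xF000  # assumed offset into the Unicode Private Use Area


def encode_trailing(name: str) -> str:
    """Map one trailing space or dot to its PUA counterpart (hypothetical helper)."""
    if name and name[-1] in " .":
        return name[:-1] + chr(ord(name[-1]) + PUA_OFFSET)
    return name


def decode_pua(name: str) -> str:
    """Map every U+F001..U+F0FF char back to U+0001..U+00FF, at any position."""
    return "".join(
        chr(ord(c) - PUA_OFFSET) if 0xF001 <= ord(c) <= 0xF0FF else c
        for c in name
    )


if __name__ == "__main__":
    stored = encode_trailing("zzs ")       # name as stored on disk
    print([hex(ord(c)) for c in stored])   # code points of the stored name
    print(repr(decode_pua(stored)))        # name as seen by the application
```

In this sketch the decoding side does not care where the PUA char sits in the name, which matches the "done all the time" description quoted above only under the assumptions stated.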
For example, touch "zzs " now results in "zzs+strangechar" in Explorer. Both cygwin and windows are able to read/write such a file. But if I edit the filename via Explorer, adding an extension (i.e. from "zzs+strangechar" to "zzs+strangechar.txt"), cygwin is suddenly unable to read/write the file. It seems to me that the appended chars prevent cygwin from translating F0xx back to 00xx (as the PUA char is no longer at the end of the filename).

So, two paths should be available:
- always translate F0xx back to 00xx, even if it is not at the end of the filename;
- otherwise, if it is too invasive to do so unconditionally, add an option such as "always_translate_pua" (default: off) to enable this behavior based on user needs.

I would (naively?) think that option 1 (always translate back PUA) should be the preferred approach, as cygwin is at the moment effectively unable to access some files.

Regards.

-- 
Danti Gionatan
Supporto Tecnico
Assyoma S.r.l. - www.assyoma.it
email: g.danti@assyoma.it - info@assyoma.it
GPG public key ID: FF5F32A8