On 2021-06-25 19:53, Vadim wrote:
> Ah, this beautiful topic. Windows 7 x64.
>
> This is the summary written as post-scriptum, tests and findings below:
>
> 1) Cygwin limits individual names to 255 bytes, Windows seems to follow
> UTF-16 chars and work fine: 256 bytes in 108 characters works.
>
> Basically, this becomes a bytes vs characters story.
>
> 2) Bash file name auto-expansion detects the file of that name, but it
> gets truncated to 255 bytes. find's behaviour is the same ("No such file
> or directory" due to trying to access a non-existing truncated name)
>
> 2.1) If you try to correct the above mistake by adding truncated
> characters, then the program (cat) will complain about "File name too long"
>
> 2.2) If there exists a folder with a 255-byte name, equal to the
> truncated name, then "find ." will do a listing on that folder twice
> (effectively hiding the long-named folder from tools without leaving an
> error message)
>
> 3) UNC Paths get the same treatment: File name too long.
>
> I expected Cygwin to handle these names without problems just like
> Windows, Explorer, cmd etc. do. Is this particular problem new or known?
> All I could find on the mailing list is around the time when Cygwin
> hadn't yet implemented Unicode support (UTF-8?), ~2004-2008.
>
> These names were created by youtube-dl.exe executed from within Cygwin.
>
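
The bytes vs characters point in 1) can be checked outside the shell.
Below is a minimal Python sketch, assuming a limit of 255 units per name
on both sides (UTF-8 bytes as seen by Cygwin, UTF-16 code units as seen
by NTFS). The names NAME_LIMIT, name_lengths and fits are made up for
illustration and are not part of Cygwin or Windows.

```python
# Sketch only: compare how long a single file name is when measured in
# characters, UTF-8 bytes and UTF-16 code units.
NAME_LIMIT = 255  # assumed per-name limit, applied to both encodings


def name_lengths(name: str) -> dict:
    """Return the length of `name` in three different units."""
    return {
        "chars": len(name),                                # Unicode code points
        "utf8_bytes": len(name.encode("utf-8")),           # what a byte limit counts
        "utf16_units": len(name.encode("utf-16-le")) // 2, # what a UTF-16 limit counts
    }


def fits(name: str, limit: int = NAME_LIMIT) -> dict:
    """Report whether `name` fits the assumed limit in each encoding."""
    n = name_lengths(name)
    return {
        "cygwin_utf8": n["utf8_bytes"] <= limit,
        "ntfs_utf16": n["utf16_units"] <= limit,
    }


if __name__ == "__main__":
    # 86 CJK characters: 3 bytes each in UTF-8, 1 code unit each in UTF-16.
    sample = "點" * 86
    print(name_lengths(sample))
    print(fits(sample))
```

A name made mostly of CJK characters reaches a 255-byte limit at roughly
a third of the character count at which it would reach a 255-code-unit
limit, which is consistent with the behaviour described above.
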
> This file name is 255 bytes long and works:
>
> s123點半蘋果新聞報道 字幕版重溫(2021年5月18日)︱蔡展鵬光顧賣淫骨場 O記
> 轉介律政司︱新巴車長被判不小心駕駛罪成︱深圳賽格大樓離奇劇晃 民眾慌忙逃
> 走︱蘋果日報 Apple Daily #香港新聞.txt
>
> This is 256 bytes and works perfectly normal in Windows (explorer, can
> paste and "dir " in cmd despite showing [] block chars), but not
> Cygwin terminal (I used s123/s1234 as a prefix for easy auto-expansion):
>
> s1234點半蘋果新聞報道 字幕版重溫(2021年5月18日)︱蔡展鵬光顧賣淫骨場 O
> 記轉介律政司︱新巴車長被判不小心駕駛罪成︱深圳賽格大樓離奇劇晃 民眾慌忙
> 逃走︱蘋果日報 Apple Daily #香港新聞.txt
>
>
> If I try to use tab-expansion in the terminal (mintty, bash) the problem
> becomes apparent ("xt" missing at the end):
>
> $ cat s1234點半蘋果新聞報道\ 字幕版重溫(2021年5月18日)︱蔡展鵬光顧賣淫
> 骨場\ O記轉介律政司︱新巴車長被判不小心駕駛罪成 ︱深圳賽格大樓離奇劇晃\
> 民眾慌忙逃走︱蘋果日報\ Apple\ Daily\ #香港新聞.t
> cat: 's1234點半蘋果新聞報道 字幕版重溫(2021年5月18日)︱蔡展鵬光顧賣淫
> 骨場 O記轉介律政司︱新巴車長被判不小心駕駛罪成︱深圳賽格大樓離奇劇晃 民
> 眾慌忙逃走︱蘋果日報 Apple Daily #香港新聞.t': No such file or directory
>
>
> However, with one fewer byte it expands properly:
>
> $ cat s123點半蘋果新聞報道\ 字幕版重溫(2021年5月18日)︱蔡展鵬光顧賣淫
> 骨場\ O記轉介律政司︱新巴車長被判不小心駕駛罪成︱深圳賽格大樓離奇劇晃\
> 民眾慌忙逃走︱蘋果日報\ Apple\ Daily\ #香港新聞.txt
> hello
>
>
> MAX_PATH? Yes, 255 bytes. Why then does the full file/folder name work
> in Windows? This is the full name (a folder), 257 bytes:
>
> 20210518_9點半蘋果新聞報道 字幕版重溫(2021年5月18日)︱蔡展鵬光顧賣淫骨
> 場 O記轉介律政司︱新巴車長被判不小心駕駛罪成︱深圳賽格大樓離奇劇晃 民眾
> 慌忙逃走︱蘋果日報 Apple Daily #香港新聞
>
> And it can get longer!
> In fact, I can bump the total path to 396 bytes
> or "Column 249" as Notepad++ counts the characters (individual folder
> name is 359b or 211 chars, "column 212"):
>
> D:/abcdefgh/Local_TEMP/cygwinunicode
> /1_123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789020210518_9
> 點半蘋果新聞報道 字幕版重溫(2021年5月18日)︱蔡展鵬光顧賣淫骨場 O記轉介
> 律政司︱新巴車長被判不小心駕駛罪成︱深圳賽格大樓離奇劇晃 民眾慌忙逃走︱
> 蘋果日報 Apple Daily #香港新聞
>
>
> NTFS allows up to 255 UTF-16 for an individual path segment and this
> seems to align with the Windows tooling: cmd and Explorer can browse
> these fine, but the included file in the folder spills beyond the limit
> and you run into the usual 'total path too long' problem).
>
> Whether you manually add the missing "xt" to the tab-completion or use
> UNC paths, the result is the same:
>
> $ cat s1234點半蘋果新聞報道\ 字幕版重溫(2021年5月18日)︱蔡展鵬光顧賣淫
> 骨場\ O記轉介律政司︱新巴車長被判不小心駕駛罪成 ︱深圳賽格大樓離奇劇晃\
> 民眾慌忙逃走︱蘋果日報\ Apple\ Daily\ #香港新聞.txt
> cat: 's1234點半蘋果新聞報道 字幕版重溫(2021年5月18日)︱蔡展鵬光顧賣淫
> 骨場 O記轉介律政司︱新巴車長被判不小心駕駛罪成︱深圳賽格大樓離奇劇晃 民
> 眾慌忙逃走︱蘋果日報 Apple Daily #香港新聞.txt': File name too long
> $ cat '\\?\D:\abcdefgh\Local_TEMP\cygwinunicode\20210518_9點半蘋果新聞報
> 道 字幕版重溫(2021年5月18日)︱蔡展鵬光顧賣淫骨場 O記轉介律政司︱新巴車
> 長被判不小心駕駛罪成︱深圳賽格大樓離奇劇晃 民眾慌忙逃走︱蘋果日報 Apple
> Daily #香港新聞.txt'
> cat: '\\?\D:\abcdefgh\Local_TEMP\cygwinunicode\20210518_9點半蘋果新聞報
> 道 字幕版重溫(2021年5月18日)︱蔡展鵬光顧賣淫骨場 O記轉介律政司︱新巴車
> 長被判不小心駕駛罪成︱深圳賽格大樓離奇劇晃 民眾慌忙逃走︱蘋果日報 Apple
> Daily #香港新聞.txt': File name too long

Filename 113 characters, 261 bytes:

$ wc -lwmcL <<< '20210518_9點半蘋果新聞報道 字幕版重溫(2021年5月18日)
︱蔡展鵬光顧賣淫骨場 O記轉介律政司︱新巴車長被判不小心駕駛罪成︱深圳賽格
大樓離奇劇晃 民眾慌忙逃走︱蘋果日報 Apple Daily #香港新聞.txt'
1 7 114 262 187

$ strace -o touch.strace /usr/bin/touch '20210518_9點半蘋果新聞報道 字幕
版重溫(2021年5月18日)︱蔡展鵬光顧賣淫骨場 O記轉介律政司︱新巴車長被判
不小心駕駛罪成︱深圳賽格大樓離奇劇晃 民眾慌忙逃走︱蘋果日報 Apple Daily #香
港新聞.txt'
/usr/bin/touch: cannot touch '20210518_9點半蘋果新聞報道 字幕版重溫
(2021年5月18日)︱蔡展鵬光顧賣淫骨場 O記轉介律政司︱
新巴車長被判不小心駕駛罪成︱深圳賽格大樓離奇劇晃 民眾慌忙逃走︱蘋果日報
Apple Daily #香港新聞.txt': File name too long

Trim 2 leading and 4
trailing bytes and it works:

/usr/bin/touch '210518_9點半蘋果新聞報道 字幕版重溫(2021年5月18日)︱蔡
展鵬光顧賣淫骨場 O記轉介
律政司︱新巴車長被判不小心駕駛罪成︱深圳賽格大
樓離奇劇晃 民眾慌忙逃走︱蘋果日報 Apple Daily #香港新聞'
$ l '210518_9點半蘋果新聞報道 字幕版重溫(2021年5月18日)︱蔡展鵬光顧賣
淫骨場 O記轉介律政司︱新巴車長被判不小心駕駛罪成︱深圳賽格大樓離奇劇晃
民眾慌忙逃走︱蘋果日報 Apple Daily #香港新聞'
'210518_9點半蘋果新聞報道 字幕版重溫(2021年5月18日)︱蔡展鵬光顧賣淫骨
場 O記轉介律政司︱新巴車長被判不小心駕駛罪成︱深圳賽格大樓離奇劇晃 民眾
慌忙逃走︱蘋果日報 Apple Daily #香港新聞'

Attached sanitized excerpt from strace of failure case, showing:

168 108424 [main] touch 38975 path_conv::check: this->path($HOME/20210518_9點半蘋果新聞報道 字幕版重溫(2021年5月18日)
︱蔡展鵬光顧賣淫骨場 O記轉介律政司︱新巴車長被判不小心駕駛罪成︱深圳賽格
大樓離奇劇晃 民眾慌忙逃走︱蘋果日報 Apple Daily #香港新聞.txt), has_acls(1)
45 108469 [main] touch 38975 __set_errno: int utimens_worker(path_conv&, const timespec*):345 setting errno 91
46 108515 [main] touch 38975 utimens_worker: -1 = utimes(/??/$HOME
/20210518_9點半蘋果新聞報道 字幕版重溫(2021年5月18日)︱蔡展鵬光顧賣淫
骨場 O記轉介律政司︱新巴車長被判不小心駕駛罪成︱深圳賽格大樓離奇劇晃 民
眾慌忙逃走︱蘋果日報 Apple Daily #香港新聞.txt, 0x0), errno 91

which appears to show that times.cc(utimens_worker) gets a zero return
value from dtable.cc(build_fh_pc) which has done something which sets
errno ENAMETOOLONG (91).

-- 
Take care. Thanks, Brian Inglis, Calgary, Alberta, Canada

This email may be disturbing to some readers as it contains too much
technical detail. Reader discretion is advised.
[Data in binary units and prefixes, physical quantities in SI.]