public inbox for cygwin@cygwin.com
* cant access to files more than 128 utf-8 symbol long names
@ 2013-12-10  7:16 Nikolay Ilychev
  2013-12-10  9:50 ` Andrey Repin
  2013-12-10 10:28 ` Corinna Vinschen
  0 siblings, 2 replies; 18+ messages in thread
From: Nikolay Ilychev @ 2013-12-10  7:16 UTC (permalink / raw)
  To: cygwin

Hello!

When using Cygwin, I can't list, copy, or remove files and directories
whose names are 128 or more UTF-8 (e.g. Cyrillic) symbols long.

Some contrived examples that illustrate the problem:

It is OK with Latin symbols:

$ a="$(perl -e 'print "x"x255')"; touch "$a" && { ls "$a"; rm "$a"; }
xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx

$ a="$(perl -e 'print "x"x256')"; touch "$a" && { ls "$a"; rm "$a"; }
touch: cannot touch 
`xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx': 
File name too long

But if I try with Cyrillic, the maximum length is just 127 symbols:

$ a="$(perl -e 'print "\xd0\xaf"x127')"; touch "$a" && { ls "$a"; rm "$a"; }
ЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯ

$ a="$(perl -e 'print "\xd0\xaf"x128')"; touch "$a" && { ls "$a"; rm "$a"; }
touch: cannot touch 
`ЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯ': 
File name too long


But users can create files with names of 250+ Cyrillic symbols using
cmd.exe, powershell.exe, or explorer.exe:

$ a="$(perl -e 'print "\xd0\xaf"x251')"; cmd /C "echo > $a" && ls -l
ls: cannot access 
ЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯ-: 
No such file or directory
total 0
-????????? ? ? ? ?            ? 
ЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯ?

And with Cygwin I have no access at all to these files or directories:

$ rm *; ls -l
rm: cannot remove 
`ЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯ\320': 
No such file or directory
ls: cannot access 
ЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯ-: 
No such file or directory
total 0
-????????? ? ? ? ?            ? 
ЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯЯ?

The same problem occurs with other tools from the Cygwin repo: find, perl, rsync.

Please make MAX_PATH count 260 UTF-8 symbols rather than 260 bytes.

Thanks.


--
Problem reports:       http://cygwin.com/problems.html
FAQ:                   http://cygwin.com/faq/
Documentation:         http://cygwin.com/docs.html
Unsubscribe info:      http://cygwin.com/ml/#unsubscribe-simple


* Re: cant access to files more than 128 utf-8 symbol long names
  2013-12-10  7:16 cant access to files more than 128 utf-8 symbol long names Nikolay Ilychev
@ 2013-12-10  9:50 ` Andrey Repin
  2013-12-10 10:28 ` Corinna Vinschen
  1 sibling, 0 replies; 18+ messages in thread
From: Andrey Repin @ 2013-12-10  9:50 UTC (permalink / raw)
  To: Nikolay Ilychev, cygwin

Greetings, Nikolay Ilychev!

> Please, make the MAX_PATH not for 260 bytes, but 260 utf-8 symbols.

You can't just change MAX_PATH; it is an operating system (i.e. Windows, not
Cygwin) constant.


--
WBR,
Andrey Repin (anrdaemon@yandex.ru) 10.12.2013, <13:37>

Sorry for my terrible english...



* Re: cant access to files more than 128 utf-8 symbol long names
  2013-12-10  7:16 cant access to files more than 128 utf-8 symbol long names Nikolay Ilychev
  2013-12-10  9:50 ` Andrey Repin
@ 2013-12-10 10:28 ` Corinna Vinschen
  2013-12-10 14:48   ` Noel Grandin
                     ` (2 more replies)
  1 sibling, 3 replies; 18+ messages in thread
From: Corinna Vinschen @ 2013-12-10 10:28 UTC (permalink / raw)
  To: cygwin

On Dec 10 11:15, Nikolay Ilychev wrote:
> Hello!
> 
> When using cygwin, i can't list, copy, remove files and directories
> with 128 utf-8 symbol long names.
> 
> useless examples that illustrates the problem:
> [...]
> same problem with other tools - find, perl, rsync from cygwin repo.
> 
> Please, make the MAX_PATH not for 260 bytes, but 260 utf-8 symbols.

Easier said than done.

First of all, this is NOT about MAX_PATH.  MAX_PATH (260 chars) is the
number of characters allowed in the Win32 ANSI file API for a complete
path, including the terminating null.  Cygwin is using the native NT API
and, occasionally, the Win32 UNICODE file API, which allows paths of up
to 32767 chars.

The problem here is about NAME_MAX.  NAME_MAX is per POSIX[1] the
"maximum number of bytes in a filename (not including the terminating
null)."

Note the word *bytes*.  Not characters, bytes. UTF-8 chars are 1 to 4
bytes in length.  Thus, the maximum number of UTF-8 chars in a filename
is potentially less than NAME_MAX:

A filename of chars only from the basic latin charset (1 byte in UTF-8)
may consist of NAME_MAX characters, a filename solely constructed from
chars of the latin-1 supplement (2 byte chars) may consist of NAME_MAX /
2 characters, a filename constructed from emoticons (4 byte chars) only
of NAME_MAX / 4 chars.
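
A minimal sketch of that arithmetic, written in Python rather than
anything Cygwin-specific.  The NAME_MAX value of 255 is the one discussed
in this thread; the helper name `max_chars` and the sample characters are
illustrative assumptions only.

```python
# Illustrative only: how many copies of one character fit into a
# filename when the limit is counted in UTF-8 bytes, not characters.
NAME_MAX = 255  # value discussed in this thread


def max_chars(ch: str, limit: int = NAME_MAX) -> int:
    """Return how many repetitions of `ch` fit into `limit` UTF-8 bytes."""
    return limit // len(ch.encode("utf-8"))


for label, ch in [
    ("basic latin 'x'", "x"),               # 1 byte in UTF-8
    ("latin-1 supplement e-acute", "\u00e9"),  # 2 bytes
    ("cyrillic YA", "\u042f"),              # 2 bytes
    ("emoticon U+1F600", "\U0001F600"),     # 4 bytes
]:
    size = len(ch.encode("utf-8"))
    print(f"{label}: {size} byte(s) each, at most {max_chars(ch)} chars")
```

For the Cyrillic case this yields 127, which matches the 127/128 boundary
seen in the original report.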

Ok, so we all know that Windows is not using a byte representation of
filenames, rather the OS uses UTF-16 to store and handle filenames
internally.  A filename on Windows filesystems may consist of up to 255
UTF-16 chars[2].

How do you represent this in a byte-oriented POSIX system?  What do you
set NAME_MAX to?  You can't get it right due to the unfortunate multibyte
vs. UTF-16 encoding issue.

To cover all UTF-8 chars, NAME_MAX would have to be 1020.  But then,
applications relying on NAME_MAX will be surprised by ENAMETOOLONG
errors for perfectly valid POSIX filenames.
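
To see where numbers of this size come from, the sketch below measures the
UTF-8 size of names that are at most 255 UTF-16 code units long.  It is
plain Python, independent of Cygwin, and the helper names are hypothetical.
One caveat, stated as an observation rather than a correction: a 4-byte
UTF-8 character occupies two UTF-16 code units, so when counting this way
the measured worst case is 3 bytes per unit (765 bytes), while 1020 is the
simpler upper bound of 4 bytes per unit.  Either figure is far above 255.

```python
# Illustrative only: UTF-8 size of names that fit in 255 UTF-16 code units.
WINDOWS_NAME_UNITS = 255  # per-component limit in UTF-16 units (from the thread)


def utf16_units(s: str) -> int:
    """Number of 16-bit code units needed to store `s` as UTF-16."""
    return len(s.encode("utf-16-le")) // 2


def utf8_bytes(s: str) -> int:
    """Number of bytes needed to store `s` as UTF-8."""
    return len(s.encode("utf-8"))


samples = {
    "ascii": "x" * 255,               # 1 unit, 1 byte each
    "cyrillic": "\u042f" * 255,       # 1 unit, 2 bytes each
    "cjk": "\u4e2d" * 255,            # 1 unit, 3 bytes each
    "emoticon": "\U0001F600" * 127,   # 2 units (surrogate pair), 4 bytes each
}

for name, s in samples.items():
    assert utf16_units(s) <= WINDOWS_NAME_UNITS
    print(f"{name}: {utf16_units(s)} UTF-16 units -> {utf8_bytes(s)} UTF-8 bytes")
```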

If you make it 255, applications will be surprised by ENAMETOOLONG
errors for perfectly valid Windows filenames.

If you make it 255 on the application level but then return filenames
longer than 255 multibyte chars to the application, they will crash
due to buffer overflow issues.  After all, NAME_MAX is a contractual
obligation.

There was also the backward compatibility issue.  Back in the pre-Cygwin
1.7 days, when Cygwin used the ANSI file API, NAME_MAX was already 255.
Changing that to a bigger value might have resulted in the
aforementioned application crashes due to buffer overflows as well.

So we decided to keep NAME_MAX at the same value as it always was, 255.
This restricts the actual filename length when using multibyte
characters just as on any other POSIX system with the downside that,
occasionally, a Windows filename will be too long to handle.

Sorry if that is frustrating in your current situation, but this
isn't something we can just change at a whim and go ahead.  It would
break compatibility with all existing Cygwin executables.


Corinna


[1] http://pubs.opengroup.org/onlinepubs/9699919799/basedefs/limits.h.html
[2] However, this does *not* cover NFS or other filesystems using a
    byte representation for storing filenames.


-- 
Corinna Vinschen                  Please, send mails regarding Cygwin to
Cygwin Maintainer                 cygwin AT cygwin DOT com
Red Hat


* Re: cant access to files more than 128 utf-8 symbol long names
  2013-12-10 10:28 ` Corinna Vinschen
@ 2013-12-10 14:48   ` Noel Grandin
  2013-12-10 15:33     ` Corinna Vinschen
  2013-12-11  7:05   ` Andrey Repin
  2013-12-11 13:49   ` Mikhail Usenko
  2 siblings, 1 reply; 18+ messages in thread
From: Noel Grandin @ 2013-12-10 14:48 UTC (permalink / raw)
  To: cygwin

On 2013-12-10 12:27, Corinna Vinschen wrote:
> Sorry if that is frustrating in your current situation, but this isn't something we can just change at a whim and go 
> ahead. It would break compatibility with all existing Cygwin executables.

Maybe this is something that could be fixed only in the 64-bit version of Cygwin?

That would limit the compatibility damage.



* Re: cant access to files more than 128 utf-8 symbol long names
  2013-12-10 14:48   ` Noel Grandin
@ 2013-12-10 15:33     ` Corinna Vinschen
  2013-12-10 16:29       ` Christopher Faylor
  2013-12-11  9:20       ` Andrey Repin
  0 siblings, 2 replies; 18+ messages in thread
From: Corinna Vinschen @ 2013-12-10 15:33 UTC (permalink / raw)
  To: cygwin

On Dec 10 16:48, Noel Grandin wrote:
> On 2013-12-10 12:27, Corinna Vinschen wrote:
> >Sorry if that is frustrating in your current situation, but this
> >isn't something we can just change at a whim and go ahead. It
> >would break compatibility with all existing Cygwin executables.
> 
> Maybe this is something that could be fixed only in the 64-bit version of Cygwin?

Did you really read my mail?  There is no fix.  You can handle this
wrongly one way or the other.  If in doubt, I prefer the POSIXly correct
way.


Corinna


* Re: cant access to files more than 128 utf-8 symbol long names
  2013-12-10 15:33     ` Corinna Vinschen
@ 2013-12-10 16:29       ` Christopher Faylor
  2013-12-11  9:20       ` Andrey Repin
  1 sibling, 0 replies; 18+ messages in thread
From: Christopher Faylor @ 2013-12-10 16:29 UTC (permalink / raw)
  To: cygwin

On Tue, Dec 10, 2013 at 04:32:59PM +0100, Corinna Vinschen wrote:
>On Dec 10 16:48, Noel Grandin wrote:
>>On 2013-12-10 12:27, Corinna Vinschen wrote:
>>>Sorry if that is frustrating in your current situation, but this isn't
>>>something we can just change at a whim and go ahead.  It would break
>>>compatibility with all existing Cygwin executables.
>>
>>Maybe this is something that could be fixed only in the 64-bit version
>>of Cygwin?
>
>Did you really read my mail?  There is no fix.  You can handle this
>wrongly one way or the other.  If in doubt, I prefer the POSIXly
>correct way.

Also, even if it was something that could be "fixed", since the 64-bit
version of Cygwin has been out for some time now, breaking backwards
compatibility would still be bad.


* Re: cant access to files more than 128 utf-8 symbol long names
  2013-12-10 10:28 ` Corinna Vinschen
  2013-12-10 14:48   ` Noel Grandin
@ 2013-12-11  7:05   ` Andrey Repin
  2013-12-11 11:24     ` Corinna Vinschen
  2013-12-11 13:49   ` Mikhail Usenko
  2 siblings, 1 reply; 18+ messages in thread
From: Andrey Repin @ 2013-12-11  7:05 UTC (permalink / raw)
  To: Corinna Vinschen

Greetings, Corinna Vinschen!

> The problem here is about NAME_MAX.  NAME_MAX is per POSIX[1] the
> "maximum number of bytes in a filename (not including the terminating
> null)."

Does this mean that the POSIX standard is not compatible with real life?
No surprise I was having a hard time copying a rather simple directory
structure to UNIX servers: just 2 levels deep, with 4-5 words in each
element name.


--
WBR,
Andrey Repin (anrdaemon@yandex.ru) 11.12.2013, <10:55>

Sorry for my terrible english...



* Re: cant access to files more than 128 utf-8 symbol long names
  2013-12-10 15:33     ` Corinna Vinschen
  2013-12-10 16:29       ` Christopher Faylor
@ 2013-12-11  9:20       ` Andrey Repin
  1 sibling, 0 replies; 18+ messages in thread
From: Andrey Repin @ 2013-12-11  9:20 UTC (permalink / raw)
  To: Corinna Vinschen

Greetings, Corinna Vinschen!

>> >Sorry if that is frustrating in your current situation, but this
>> >isn't something we can just change at a whim and go ahead. It
>> >would break compatibility with all existing Cygwin executables.
>> 
>> Maybe this is something that could be fixed only in the 64-bit version of Cygwin?

> Did you really read my mail?  There is no fix.  You can handle this
> wrongly one way or the other.  If in doubt, I prefer the POSIXly correct
> way.

After off-list discussion, Nikolay partially solved this issue by using
locale-appropriate single-byte encoding in LANG.
In this case,

LANG=ru_RU.CP1251

It is far from a perfect solution, but at least it lets him access the files
in question.
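
Why this helps can be sketched in a few lines of Python (not Cygwin code;
the variable names are illustrative).  In a single-byte code page such as
CP1251 each Cyrillic letter takes one byte, so the 251-letter name from the
original report fits within NAME_MAX = 255, whereas its UTF-8 form does not.

```python
NAME_MAX = 255  # byte limit discussed in this thread

name = "\u042f" * 251  # 251 x CYRILLIC CAPITAL LETTER YA, as in the report

utf8_len = len(name.encode("utf-8"))     # 2 bytes per letter
cp1251_len = len(name.encode("cp1251"))  # 1 byte per letter

print(f"UTF-8:  {utf8_len} bytes, fits: {utf8_len <= NAME_MAX}")
print(f"CP1251: {cp1251_len} bytes, fits: {cp1251_len <= NAME_MAX}")
```

The trade-off, presumably, is that this only helps for characters the
chosen code page can represent in a single byte.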


--
WBR,
Andrey Repin (anrdaemon@yandex.ru) 11.12.2013, <13:12>

Sorry for my terrible english...



* Re: cant access to files more than 128 utf-8 symbol long names
  2013-12-11  7:05   ` Andrey Repin
@ 2013-12-11 11:24     ` Corinna Vinschen
  0 siblings, 0 replies; 18+ messages in thread
From: Corinna Vinschen @ 2013-12-11 11:24 UTC (permalink / raw)
  To: cygwin

On Dec 11 11:04, Andrey Repin wrote:
> Greetings, Corinna Vinschen!
> 
> > The problem here is about NAME_MAX.  NAME_MAX is per POSIX[1] the
> > "maximum number of bytes in a filename (not including the terminating
> > null)."
> 
> Does this mean that POSIX standard is not compatible with real life?

Are you asking nonsensical questions for fun or did you not read my mail
closely, too?  I made the effort to reply to the OP with a detailed mail
explaining the issue.  I don't understand what this snide reaction is
supposed to accomplish.  POSIX and Windows are not naturally compatible.
Cygwin tries hard to bridge the gap, but sometimes the gap is really
wide.

But thanks anyway for providing a solution to the problem by setting the
locale environment variables.  That might make a good FAQ entry, *iff*
somebody has the incentive to write one.


Corinna


* Re: cant access to files more than 128 utf-8 symbol long names
  2013-12-10 10:28 ` Corinna Vinschen
  2013-12-10 14:48   ` Noel Grandin
  2013-12-11  7:05   ` Andrey Repin
@ 2013-12-11 13:49   ` Mikhail Usenko
  2013-12-11 14:09     ` Corinna Vinschen
  2 siblings, 1 reply; 18+ messages in thread
From: Mikhail Usenko @ 2013-12-11 13:49 UTC (permalink / raw)
  To: cygwin

On Tue, 10 Dec 2013 11:27:55 +0100
Corinna Vinschen <...> wrote:


> Easier said than done.
> 
> Cygwin is using the native NT API
> and, occasionally, the Win32 UNICODE file API, which allows paths of up
> to 32767 chars.
> ...
> How do you represent this in a byte-oriented POSIX system?  What do you
> set NAME_MAX to?  You can't get it right due to the unfortunate multibyte
> vs. UTF-16 encoding issue.
> 
> To cover all UTF-8 chars, NAME_MAX would have to be 1020.  But then,
> applications relying on NAME_MAX will be surprised by ENAMETOOLONG
> errors for perfectly valid POSIX filenames.
> 
> If you make it 255, applications will be surprised by ENAMETOOLONG
> errors for perfectly valid Windows filenames.
> 

Strictly speaking, the POSIX NAME_MAX and PATH_MAX limits would have to be
32767*4 bytes, that is ~128K, on Windows systems.  With such a value, no
Cygwin application running on Windows would ever hit the ENAMETOOLONG error,
because no actual filenames of that length (and hence no such POSIX
filenames) can exist.  Did I understand right?


* Re: cant access to files more than 128 utf-8 symbol long names
  2013-12-11 13:49   ` Mikhail Usenko
@ 2013-12-11 14:09     ` Corinna Vinschen
  2013-12-11 15:03       ` Mikhail Usenko
  0 siblings, 1 reply; 18+ messages in thread
From: Corinna Vinschen @ 2013-12-11 14:09 UTC (permalink / raw)
  To: cygwin

On Dec 11 17:49, Mikhail Usenko wrote:
> On Tue, 10 Dec 2013 11:27:55 +0100
> Corinna Vinschen <...> wrote:
> 
> 
> > Easier said than done.
> > 
> > Cygwin is using the native NT API
> > and, occasionally, the Win32 UNICODE file API, which allows paths of up
> > to 32767 chars.
> > ...
> > How do you represent this in a byte-oriented POSIX system?  What do you
> > set NAME_MAX to?  You can't get it right due to the unfortunate multibyte
> > vs. UTF-16 encoding issue.
> > 
> > To cover all UTF-8 chars, NAME_MAX would have to be 1020.  But then,
> > applications relying on NAME_MAX will be surprised by ENAMETOOLONG
> > errors for perfectly valid POSIX filenames.
> > 
> > If you make it 255, applications will be surprised by ENAMETOOLONG
> > errors for perfectly valid Windows filenames.
> > 
> 
> Strictly speaking, the NAME_MAX and PATH_MAX POSIX' limits must be
> 32767*4 bytes, that is ~128K on Windows systems. With such a value no

Strictly speaking you're wrong.  NAME_MAX is the length of a single
path component, not the length of a path:

     NAME_MAX
       vvv
  /foo/bar/baz\0
  ^^^^^^^^^^^^^^
     PATH_MAX

Also, PATH_MAX is NOT the maximum length of a path, but the

  "Maximum number of bytes the implementation will store as a pathname
   in a user-supplied buffer of unspecified size, including the
   terminating null character."

That does not mean there are no longer paths possible, just that you
have to use, for instance, relative paths rather than absolute paths, if
the absolute path becomes longer than PATH_MAX, and that the system
does not guarantee to return paths if they are longer than PATH_MAX.
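
The distinction can be sketched as a small check in Python.  This is only
an illustration of the two limits, not how Cygwin implements them: the
function name is hypothetical, NAME_MAX = 255 comes from this thread, and
PATH_MAX = 4096 is an assumed value chosen for the example.

```python
NAME_MAX = 255    # bytes per path component (value from this thread)
PATH_MAX = 4096   # assumed for illustration; includes the terminating null


def check_path(path: str, encoding: str = "utf-8"):
    """Return a list of limit violations for `path`, measured in bytes."""
    problems = []
    raw = path.encode(encoding)
    # PATH_MAX counts the terminating null, so the usable length is one less.
    if len(raw) + 1 > PATH_MAX:
        problems.append(f"path of {len(raw)} bytes does not fit in PATH_MAX")
    # NAME_MAX applies to each component between slashes separately.
    for component in raw.split(b"/"):
        if len(component) > NAME_MAX:
            problems.append(f"component of {len(component)} bytes exceeds NAME_MAX")
    return problems


print(check_path("/foo/bar/baz"))            # short path, short components
print(check_path("/foo/" + "\u042f" * 128))  # last component is 256 bytes
```

A path can therefore fail on a single over-long component even when the
whole path is short, and vice versa.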


Corinna


* Re: cant access to files more than 128 utf-8 symbol long names
  2013-12-11 14:09     ` Corinna Vinschen
@ 2013-12-11 15:03       ` Mikhail Usenko
  2013-12-11 15:23         ` Corinna Vinschen
  0 siblings, 1 reply; 18+ messages in thread
From: Mikhail Usenko @ 2013-12-11 15:03 UTC (permalink / raw)
  To: cygwin

I can't figure out how a POSIX filename passed to a Cygwin application
running on a Windows system could become longer than NAME_MAX=1020 bytes if
the maximum filename length in NTFS is 255 UTF-16 symbols (i.e. 1020 bytes
for the biggest 4-byte UTF-8 code unit).
What causes the ENAMETOOLONG error?  In most POSIX functions,
ENAMETOOLONG is returned if the length of a component of a pathname is
longer than {NAME_MAX} or the length of a pathname exceeds {PATH_MAX}.  On
NTFS there are no files with a pathname component longer than 1020 bytes,
and the length of the full pathname is limited by the Unicode API (32767
chars * 4 bytes = 128 KiB).

* Re: cant access to files more than 128 utf-8 symbol long names
  2013-12-11 15:03       ` Mikhail Usenko
@ 2013-12-11 15:23         ` Corinna Vinschen
  2013-12-11 15:28           ` Mikhail Usenko
  0 siblings, 1 reply; 18+ messages in thread
From: Corinna Vinschen @ 2013-12-11 15:23 UTC (permalink / raw)
  To: cygwin

On Dec 11 19:02, Mikhail Usenko wrote:
> I couldn't figure out how a POSIX filename passed to a Cygwin
> application running on the Windows system may become longer than
> NAME_MAX=1020 bytes if the maximum filename length in NTFS is 255
> UTF-16 symbols (i.e. 1020 bytes for the biggest 4 byte UTF-8 code
> unit)? 

Read my mail again.  NAME_MAX is 255.


Corinna


* Re: cant access to files more than 128 utf-8 symbol long names
  2013-12-11 15:23         ` Corinna Vinschen
@ 2013-12-11 15:28           ` Mikhail Usenko
  2013-12-11 16:21             ` Corinna Vinschen
  0 siblings, 1 reply; 18+ messages in thread
From: Mikhail Usenko @ 2013-12-11 15:28 UTC (permalink / raw)
  To: cygwin

On Wed, 11 Dec 2013 16:23:39 +0100
Corinna Vinschen <...> wrote:

> On Dec 11 19:02, Mikhail Usenko wrote:
> > I couldn't figure out how a POSIX filename passed to a Cygwin
> > application running on the Windows system may become longer than
> > NAME_MAX=1020 bytes if the maximum filename length in NTFS is 255
> > UTF-16 symbols (i.e. 1020 bytes for the biggest 4 byte UTF-8 code
> > unit)? 
> 
> Read my mail again.  NAME_MAX is 255.
> 
> 
> Corinna

Corinna, why not 1020?



* Re: cant access to files more than 128 utf-8 symbol long names
  2013-12-11 15:28           ` Mikhail Usenko
@ 2013-12-11 16:21             ` Corinna Vinschen
  2013-12-11 16:30               ` Christopher Faylor
  0 siblings, 1 reply; 18+ messages in thread
From: Corinna Vinschen @ 2013-12-11 16:21 UTC (permalink / raw)
  To: cygwin

On Dec 11 19:27, Mikhail Usenko wrote:
> On Wed, 11 Dec 2013 16:23:39 +0100
> Corinna Vinschen <...> wrote:
> 
> > On Dec 11 19:02, Mikhail Usenko wrote:
> > > I couldn't figure out how a POSIX filename passed to a Cygwin
> > > application running on the Windows system may become longer than
> > > NAME_MAX=1020 bytes if the maximum filename length in NTFS is 255
> > > UTF-16 symbols (i.e. 1020 bytes for the biggest 4 byte UTF-8 code
> > > unit)? 
> > 
> > Read my mail again.  NAME_MAX is 255.
> > 
> > 
> > Corinna
> 
> Corinna, why not 1020?

That's answered in my original mail.


Corinna


* Re: cant access to files more than 128 utf-8 symbol long names
  2013-12-11 16:21             ` Corinna Vinschen
@ 2013-12-11 16:30               ` Christopher Faylor
  2013-12-11 17:01                 ` Corinna Vinschen
  0 siblings, 1 reply; 18+ messages in thread
From: Christopher Faylor @ 2013-12-11 16:30 UTC (permalink / raw)
  To: cygwin

On Wed, Dec 11, 2013 at 05:21:37PM +0100, Corinna Vinschen wrote:
>On Dec 11 19:27, Mikhail Usenko wrote:
>> On Wed, 11 Dec 2013 16:23:39 +0100
>> Corinna Vinschen <...> wrote:
>> 
>> > On Dec 11 19:02, Mikhail Usenko wrote:
>> > > I couldn't figure out how a POSIX filename passed to a Cygwin
>> > > application running on the Windows system may become longer than
>> > > NAME_MAX=1020 bytes if the maximum filename length in NTFS is 255
>> > > UTF-16 symbols (i.e. 1020 bytes for the biggest 4 byte UTF-8 code
>> > > unit)? 
>> > 
>> > Read my mail again.  NAME_MAX is 255.
>> > 
>> > 
>> > Corinna
>> 
>> Corinna, why not 1020?
>
>That's answered in my original mail.

Perhaps this will require reiteration and reclarification on Thursday,
feline-permitting.

YMMV!

cgf


* Re: cant access to files more than 128 utf-8 symbol long names
  2013-12-11 16:30               ` Christopher Faylor
@ 2013-12-11 17:01                 ` Corinna Vinschen
  2013-12-11 17:49                   ` Christopher Faylor
  0 siblings, 1 reply; 18+ messages in thread
From: Corinna Vinschen @ 2013-12-11 17:01 UTC (permalink / raw)
  To: cygwin

On Dec 11 11:30, Christopher Faylor wrote:
> On Wed, Dec 11, 2013 at 05:21:37PM +0100, Corinna Vinschen wrote:
> >On Dec 11 19:27, Mikhail Usenko wrote:
> >> On Wed, 11 Dec 2013 16:23:39 +0100
> >> Corinna Vinschen <...> wrote:
> >> 
> >> > On Dec 11 19:02, Mikhail Usenko wrote:
> >> > > I couldn't figure out how a POSIX filename passed to a Cygwin
> >> > > application running on the Windows system may become longer than
> >> > > NAME_MAX=1020 bytes if the maximum filename length in NTFS is 255
> >> > > UTF-16 symbols (i.e. 1020 bytes for the biggest 4 byte UTF-8 code
> >> > > unit)? 
> >> > 
> >> > Read my mail again.  NAME_MAX is 255.
> >> > 
> >> > 
> >> > Corinna
> >> 
> >> Corinna, why not 1020?
> >
> >That's answered in my original mail.
> 
> Perhaps this will require reiteration and reclarification on Thursday,
> feline-permitting.

And it's not even my WJM week.  Can we move that to Thursday next week?


Corinna


* Re: cant access to files more than 128 utf-8 symbol long names
  2013-12-11 17:01                 ` Corinna Vinschen
@ 2013-12-11 17:49                   ` Christopher Faylor
  0 siblings, 0 replies; 18+ messages in thread
From: Christopher Faylor @ 2013-12-11 17:49 UTC (permalink / raw)
  To: cygwin

On Wed, Dec 11, 2013 at 06:01:03PM +0100, Corinna Vinschen wrote:
>>Perhaps this will require reiteration and reclarification on Thursday,
>>feline-permitting.
>
>And it's not even my WJM week.  Can we move that to Thursday next week?

Sorry, no.  I can't allow that.  But, then, it's my week.

cgf

