public inbox for cygwin@cygwin.com
* Accessing filenames with different charsets
@ 2002-07-01  1:59 Ville Herva
  2002-07-02 14:50 ` Ville Herva
  0 siblings, 1 reply; 15+ messages in thread
From: Ville Herva @ 2002-07-01  1:59 UTC (permalink / raw)
  To: cygwin

Sorry if this has already been discussed, but I couldn't find it in the
archive nor in the FAQ...

If I have a file name with Russian characters in it, cygwin is unable to
access it:

> ls 
????.TEST

(Russian characters are shown as '?' in directory listing, but ls does find
the file).

If I try to access it, however, open fails:

> touch *
touch: '????.TEST': no such file or directory

same deal with less, cp, rm, rsync etc.

On NT4 and W2k, the same happens even with the euro character ('¤') - on XP
it works, but Russian chars (among others, I guess) fail. On XP I have the
newest cygwin.dll, so that can make a difference as well.

BTW: dir /x shows an interesting short name for the file: F305~1.TES; cygpath
-w -s doesn't show anything.

So is it possible to access these files (I'd like to be able to backup the
workstation via cygwin tools), or is the problem fundamental?


-- v --

v@iki.fi

--
Unsubscribe info:      http://cygwin.com/ml/#unsubscribe-simple
Bug reporting:         http://cygwin.com/bugs.html
Documentation:         http://cygwin.com/docs.html
FAQ:                   http://cygwin.com/faq/


* Re: Accessing filenames with different charsets
  2002-07-01  1:59 Accessing filenames with different charsets Ville Herva
@ 2002-07-02 14:50 ` Ville Herva
  2002-07-02 14:59   ` Chris January
  0 siblings, 1 reply; 15+ messages in thread
From: Ville Herva @ 2002-07-02 14:50 UTC (permalink / raw)
  To: cygwin

On Mon, Jul 01, 2002 at 11:58:51AM +0300, you [Ville Herva] wrote:
> Sorry if this has already been discussed, but I couldn't find it in the
> archive nor in the FAQ...
> 
> If I have a file name with Russian characters in it, cygwin is unable to
> access it:
> 
> > ls 
> ????.TEST
> 
> (Russian characters are shown as '?' in directory listing, but ls does find
> the file).
> 
> If I try to access it, however, open fails:
> 
> > touch *
> touch: '????.TEST': no such file or directory
> 
> same deal with less, cp, rm, rsync etc.

Okay, it seems cygwin readdir() returns the filenames as "????.TEST" (where
the ?s are literal question marks, ASCII 0x3f). Looking at
fhandler_disk_file.cc, this can't be caused by much else than
FindFirstFileA() returning "????.TEST". And indeed, I made a little
non-Unicode test program that called FindFirstFile, and it returned
"????.TEST" ("\x3f\x3f\x3f\x3f.TEST").

To access the file, the wide char versions of the Find*File() functions would
probably have to be used (or is there another way?). I have no idea how this
could be integrated into the cygwin framework...
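
A minimal sketch of why the "????.TEST" name cannot be reopened. It does not
call the Win32 API; narrow_lossy() is a hypothetical stand-in for the
wide-to-ANSI conversion, and it assumes a Latin-1-like code page where any
UTF-16 code unit above 0xFF has no single-byte equivalent. The real mapping
depends on the system code page.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the lossy wide-to-ANSI step (not a Win32 call).
 * Code units that fit in one byte are kept; everything else becomes '?'.
 * Assumption: a Latin-1-like code page. */
void narrow_lossy(const unsigned short *wide, char *out, size_t out_size)
{
    size_t i = 0;

    assert(wide != NULL && out != NULL && out_size > 0);
    while (wide[i] != 0 && i + 1 < out_size) {
        out[i] = (wide[i] <= 0xFF) ? (char)(unsigned char)wide[i] : '?';
        i++;
    }
    out[i] = '\0';
}
```

Any two names that differ only in their unmappable characters narrow to the
same string, so the original name cannot be recovered from it. On top of
that, '?' is not a legal character in a Windows filename, which would explain
why passing "????.TEST" back to open() fails.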

Any ideas?


-- v --

v@iki.fi


* Re: Accessing filenames with different charsets
  2002-07-02 14:50 ` Ville Herva
@ 2002-07-02 14:59   ` Chris January
  2002-07-02 15:05     ` Ville Herva
  2002-07-03  2:48     ` Ville Herva
  0 siblings, 2 replies; 15+ messages in thread
From: Chris January @ 2002-07-02 14:59 UTC (permalink / raw)
  To: cygwin, v

> > Sorry if this has already been discussed, but I couldn't find it in the
> > archive nor in the FAQ...
> >
> > If I have a file name with Russian characters in it, cygwin is unable to
> > access it:
> >
> > > ls
> > ????.TEST
> >
> > (Russian characters are shown as '?' in directory listing, but ls does find
> > the file).
> >
> > If I try to access it, however, open fails:
> >
> > > touch *
> > touch: '????.TEST': no such file or directory
> >
> > same deal with less, cp, rm, rsync etc.
>
> Okay, it seems cygwin readdir() returns the filenames as "????.TEST" (where
> ?:s are really ?:s (ascii 0x3f)). Looking at fhandler_disk_file.cc, this
> can't be caused by much else than by FindFirstFileA() returning "????.TEST".
> And indeed, I made a little non-unicode test program that called
> FindFirstFile, and it returned "????.TEST" ("\x3f\x3f\x3f\x3f.TEST").
>
> To access the file, the wide char versions of Find*File() functions would
> probably have to be used (or is there another way?). I have no idea how this
> could be integrated into the cygwin framework...
>
> Any ideas?

Qt (from Trolltech) encodes Unicode filenames before they are used. In
Cygwin we could do the reverse, i.e. use Find*FileW and then encode the
Unicode as a local ANSI string. If we do the encoding manually in Cygwin,
rather than let Windows do it for us, this would overcome the problem. I
will try to put together a patch for this that you can test. One possibility
is to encode Unicode strings as UTF-8.
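
The UTF-8 option could be sketched as below. This is a standalone
illustration, not the patch discussed in this thread: utf16_to_utf8() is a
hypothetical helper name, and it assumes the wide name arrives as a
NUL-terminated array of UTF-16 code units.

```c
#include <assert.h>
#include <stddef.h>

/* Encode a NUL-terminated UTF-16 string as UTF-8 (hypothetical helper).
 * Returns the number of bytes written, not counting the terminating NUL,
 * or (size_t)-1 if the buffer is too small or a surrogate is unpaired. */
size_t utf16_to_utf8(const unsigned short *in, char *out, size_t out_size)
{
    size_t n = 0;

    assert(in != NULL && out != NULL);
    while (*in != 0) {
        unsigned long cp = *in++;
        unsigned char tmp[4];
        size_t len, k;

        if (cp >= 0xD800 && cp <= 0xDBFF) {        /* high surrogate */
            if (*in < 0xDC00 || *in > 0xDFFF)
                return (size_t)-1;                 /* no low surrogate follows */
            cp = 0x10000 + ((cp - 0xD800) << 10) + (unsigned long)(*in++ - 0xDC00);
        } else if (cp >= 0xDC00 && cp <= 0xDFFF) { /* stray low surrogate */
            return (size_t)-1;
        }

        if (cp < 0x80) {
            tmp[0] = (unsigned char)cp;
            len = 1;
        } else if (cp < 0x800) {
            tmp[0] = (unsigned char)(0xC0 | (cp >> 6));
            tmp[1] = (unsigned char)(0x80 | (cp & 0x3F));
            len = 2;
        } else if (cp < 0x10000) {
            tmp[0] = (unsigned char)(0xE0 | (cp >> 12));
            tmp[1] = (unsigned char)(0x80 | ((cp >> 6) & 0x3F));
            tmp[2] = (unsigned char)(0x80 | (cp & 0x3F));
            len = 3;
        } else {
            tmp[0] = (unsigned char)(0xF0 | (cp >> 18));
            tmp[1] = (unsigned char)(0x80 | ((cp >> 12) & 0x3F));
            tmp[2] = (unsigned char)(0x80 | ((cp >> 6) & 0x3F));
            tmp[3] = (unsigned char)(0x80 | (cp & 0x3F));
            len = 4;
        }

        if (n + len + 1 > out_size)                /* keep room for the NUL */
            return (size_t)-1;
        for (k = 0; k < len; k++)
            out[n++] = (char)tmp[k];
    }
    if (n + 1 > out_size)
        return (size_t)-1;
    out[n] = '\0';
    return n;
}
```

Unlike a code-page conversion, this mapping loses nothing, so a name returned
by readdir() in this form can in principle be converted back to the exact
wide name when the file is opened.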

Chris




* Re: Accessing filenames with different charsets
  2002-07-02 14:59   ` Chris January
@ 2002-07-02 15:05     ` Ville Herva
  2002-07-03  2:48     ` Ville Herva
  1 sibling, 0 replies; 15+ messages in thread
From: Ville Herva @ 2002-07-02 15:05 UTC (permalink / raw)
  To: cygwin

On Tue, Jul 02, 2002 at 10:50:31PM +0100, you [Chris January] wrote:
> Qt (from Trolltech) encodes Unicode filenames before they are used. In
> Cygwin we could do the reverse, i.e. use Find*FileW and then encode the
> Unicode as a local ANSI string. If we do the encoding manually in Cygwin,
> rather than let Windows do it for us, this would overcome the problem. I
> will try to put together a patch for this that you can test. One possibility
> is to encode Unicode strings as UTF-8.

It sounds complicated, but if you are really willing to make a patch, I will
certainly test it.

This might be a good chance to take care of filenames over 255 chars long,
which ISTR are only accessible via Find*FileW and the (ugly) \\.\ hack. I
recall once having tried to access such a beast via cygwin, but it failed
(most probably for the same reason as the differing charset ones).


-- v --

v@iki.fi


* Re: Accessing filenames with different charsets
  2002-07-02 14:59   ` Chris January
  2002-07-02 15:05     ` Ville Herva
@ 2002-07-03  2:48     ` Ville Herva
  2002-07-03  3:18       ` Chris January
  1 sibling, 1 reply; 15+ messages in thread
From: Ville Herva @ 2002-07-03  2:48 UTC (permalink / raw)
  To: cygwin

On Tue, Jul 02, 2002 at 10:50:31PM +0100, you [Chris January] wrote:
> Qt (from Trolltech) encodes Unicode filenames before they are used. In
> Cygwin we could do the reverse, i.e. use Find*FileW and then encode the
> Unicode as a local ANSI string. If we do the encoding manually in Cygwin,
> rather than let Windows do it for us, this would overcome the problem. I
> will try to put together a patch for this that you can test. One possibility
> is to encode Unicode strings as UTF-8.

Another idea that comes to mind: use the cAlternateFileName field from
WIN32_FIND_DATA - that is, the 8.3 filename. I tried it, and I can access
the file via its 8.3 name in cygwin:

  wc F305~1.TES
    318    1214   10141 F305~1.TES

So all that'd have to be done is make cygwin readdir (and friends) return
the 8.3 name if the normal name is inaccessible (different charset, too long
name... I'm not yet sure how to detect this).

The advantage over encoding the wide char name somehow is that the 8.3 name
is usable in DOS/windows as well. The disadvantage is that a name like
F305~1 doesn't really tell anything about the real filename. And, if you
back it up (say, with tar), and then restore it, you lose the original name.
While not perfect that's still better than losing the whole file.
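
One possible way to detect the inaccessible case, as a rough sketch only:
'?' is a reserved character in Windows filenames, so a '?' in a name that
came back from an ANSI directory scan can only be the result of a lossy
conversion. The helper name pick_accessible_name() and its two string
parameters are hypothetical; they stand in for the cFileName and
cAlternateFileName fields. It does not cover the too-long-name case.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical helper: choose which name a directory scan should return.
 * long_name  - ANSI long name from the scan
 * short_name - 8.3 alternate name, or "" if the file has none
 * Falls back to the 8.3 name only when the long name contains '?',
 * which cannot occur in a real Windows filename. */
const char *pick_accessible_name(const char *long_name, const char *short_name)
{
    assert(long_name != NULL && short_name != NULL);
    if (strchr(long_name, '?') != NULL && short_name[0] != '\0')
        return short_name;
    return long_name;
}
```

With the file from this thread, pick_accessible_name("????.TEST",
"F305~1.TES") would return "F305~1.TES", while ordinary names pass through
unchanged.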


-- v --

v@iki.fi


* Re: Accessing filenames with different charsets
  2002-07-03  2:48     ` Ville Herva
@ 2002-07-03  3:18       ` Chris January
  2002-07-03  3:30         ` Ville Herva
  0 siblings, 1 reply; 15+ messages in thread
From: Chris January @ 2002-07-03  3:18 UTC (permalink / raw)
  To: Ville Herva; +Cc: cygwin

> > Qt (from Trolltech) encodes Unicode filenames before they are used. In
> > Cygwin we could do the reverse, i.e. use Find*FileW and then encode the
> > Unicode as a local ANSI string. If we do the encoding manually in Cygwin,
> > rather than let Windows do it for us, this would overcome the problem. I
> > will try to put together a patch for this that you can test. One possibility
> > is to encode Unicode strings as UTF-8.
>
> Another idea that comes into mind: use the cAlternateFileName field from
> WIN32_FIND_DATA - that is, the 8.3 filename. I tried it, and I can access
> the file via its 8.3 name in cygwin:
>
>   wc F305~1.TES
>     318    1214   10141 F305~1.TES
>
> So all that'd have to be done is make cygwin readdir (and friends) return
> the 8.3 name if the normal name is inaccessible (different charset, too long
> name... I'm not yet sure how to detect this).
>
> The advantage over encoding the wide char name somehow is that the 8.3 name
> is usable in DOS/windows as well. The disadvantage is that a name like
> F305~1 doesn't really tell anything about the real filename. And, if you
> back it up (say, with tar), and then restore it, you lose the original name.
> While not perfect that's still better than losing the whole file.

I wrote a patch for Cygwin yesterday that converts Unicode filenames to UTF8
and back for some file operations. This should do what you want and allow
you to restore the names correctly later. I will post it to cygwin-patches
sometime today, but I'm not sure whether the patch will appear in a Cygwin
snapshot anytime soon, if not I can send you the modified binary and the
patch directly for you to try. The only disadvantage with this method is it
still makes the filenames impossible to type. However, if you have your
terminal set up correctly, it is certainly possible to read them as they
should be (e.g. in xterm with UTF8 support turned on). If you are using a
graphical file browser like konqueror then that makes things even easier.
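
The "and back" direction might look roughly like the sketch below. It is not
taken from the patch; utf8_to_utf16() is a hypothetical helper, and for
brevity it does not reject overlong encodings.

```c
#include <assert.h>
#include <stddef.h>

/* Decode a NUL-terminated UTF-8 string into UTF-16 (hypothetical helper).
 * out_len is the capacity of out in 16-bit units, including the NUL.
 * Returns the number of units written, not counting the NUL, or (size_t)-1
 * on malformed input or insufficient space. Overlong forms are not checked. */
size_t utf8_to_utf16(const char *in, unsigned short *out, size_t out_len)
{
    const unsigned char *p = (const unsigned char *)in;
    size_t n = 0;

    assert(in != NULL && out != NULL);
    while (*p != 0) {
        unsigned long cp;
        int extra, k;

        if (*p < 0x80)                { cp = *p;        extra = 0; }
        else if ((*p & 0xE0) == 0xC0) { cp = *p & 0x1F; extra = 1; }
        else if ((*p & 0xF0) == 0xE0) { cp = *p & 0x0F; extra = 2; }
        else if ((*p & 0xF8) == 0xF0) { cp = *p & 0x07; extra = 3; }
        else return (size_t)-1;       /* invalid lead byte */
        p++;

        for (k = 0; k < extra; k++, p++) {
            if ((*p & 0xC0) != 0x80)  /* also catches a premature NUL */
                return (size_t)-1;
            cp = (cp << 6) | (unsigned long)(*p & 0x3F);
        }
        if (cp > 0x10FFFF || (cp >= 0xD800 && cp <= 0xDFFF))
            return (size_t)-1;

        if (cp >= 0x10000) {          /* needs a surrogate pair */
            if (n + 3 > out_len)
                return (size_t)-1;
            cp -= 0x10000;
            out[n++] = (unsigned short)(0xD800 + (cp >> 10));
            out[n++] = (unsigned short)(0xDC00 + (cp & 0x3FF));
        } else {
            if (n + 2 > out_len)
                return (size_t)-1;
            out[n++] = (unsigned short)cp;
        }
    }
    if (n + 1 > out_len)
        return (size_t)-1;
    out[n] = 0;
    return n;
}
```

Decoding the UTF-8 name a program passes in would yield the wide name to hand
to the W variants of the file functions, which is what lets a backup restore
the original name.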

Regards
Chris





* Re: Accessing filenames with different charsets
  2002-07-03  3:18       ` Chris January
@ 2002-07-03  3:30         ` Ville Herva
  2002-07-03  3:38           ` Chris January
  2002-07-03  6:24           ` Bernard A Badger
  0 siblings, 2 replies; 15+ messages in thread
From: Ville Herva @ 2002-07-03  3:30 UTC (permalink / raw)
  To: cygwin

On Wed, Jul 03, 2002 at 10:58:38AM +0100, you [Chris January] wrote:
>
> I wrote a patch for Cygwin yesterday that converts Unicode filenames to UTF8
> and back for some file operations. 

Nice!

> This should do what you want and allow you to restore the names correctly
> later. I will post it to cygwin-patches sometime today, but I'm not sure
> whether the patch will appear in a Cygwin snapshot anytime soon, if not I
> can send you the modified binary and the patch directly for you to try.
> The only disadvantage with this method is it still makes the filenames
> impossible to type. However, if you have your terminal set up correctly,
> it is certainly possible to read them as they should be (e.g. in xterm
> with UTF8 support turned on). If you are using a graphical file browser
> like konqueror then that makes things even easier.

Yes, the UTF8 approach is probably preferable in all ways. As for typing, I
imagine you can get the 8.3 name with

  for i in *; do echo "$i: $(cygpath -w -s "$i")"; done

so you'll be able to type some name for the file as well.

I'm really glad to see this fixed. 

What about filenames longer than MAX_PATH? Those can only be accessed with
"\\.\<path>" and unicode file functions...


-- v --

v@iki.fi


* Re: Accessing filenames with different charsets
  2002-07-03  3:30         ` Ville Herva
@ 2002-07-03  3:38           ` Chris January
  2002-07-03  4:00             ` Ville Herva
  2002-07-03  5:20             ` Re[2]: " Nicholas Wourms
  2002-07-03  6:24           ` Bernard A Badger
  1 sibling, 2 replies; 15+ messages in thread
From: Chris January @ 2002-07-03  3:38 UTC (permalink / raw)
  To: Ville Herva; +Cc: cygwin

> > I wrote a patch for Cygwin yesterday that converts Unicode filenames to UTF8
> > and back for some file operations.
>
> Nice!
>
> > This should do what you want and allow you to restore the names correctly
> > later. I will post it to cygwin-patches sometime today, but I'm not sure
> > whether the patch will appear in a Cygwin snapshot anytime soon, if not I
> > can send you the modified binary and the patch directly for you to try.
> > The only disadvantage with this method is it still makes the filenames
> > impossible to type. However, if you have your terminal set up correctly,
> > it is certainly possible to read them as they should be (e.g. in xterm
> > with UTF8 support turned on). If you are using a graphical file browser
> > like konqueror then that makes things even easier.
>
> Yes, UTF8 approach is probably preferable in all ways. As for typing, I
> imagine you can get the 8.3 name with
>
>   for i in *; do echo $i: `cygpath -w -s $i`; done
>
> so you'll be able to type some name for the file as well.
>
> I'm really glad to see this fixed.
>
> What about filenames longer than MAX_PATH? Those can only be accessed with
> "\\.\<path>" and unicode file functions...

Since most programs internally allocate a buffer of size MAX_PATH or
PATH_MAX, they won't have enough room to store the full filename. It would
certainly be possible to support this if a system call was made with a long
filename, but that would mean replacing all statically allocated path
buffers (e.g. char buf[MAX_PATH]) with alloca (e.g. char *buf = alloca
(strlen(inbuf) + margin)), which is far from trivial.
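
To illustrate the kind of change involved, here is a small sketch of a path
join whose buffer is sized from its inputs instead of from MAX_PATH. It uses
malloc rather than alloca so that the result can outlive the calling
function; join_path() is a hypothetical name and does not correspond to any
function in the Cygwin sources.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Join dir and name with '/' into a freshly allocated buffer
 * (hypothetical helper). The buffer is sized from the inputs, so no fixed
 * MAX_PATH limit applies. The caller must free() the result.
 * Returns NULL if the allocation fails. */
char *join_path(const char *dir, const char *name)
{
    size_t dlen, nlen;
    char *buf;

    assert(dir != NULL && name != NULL);
    dlen = strlen(dir);
    nlen = strlen(name);

    buf = malloc(dlen + 1 + nlen + 1);     /* dir + '/' + name + NUL */
    if (buf == NULL)
        return NULL;

    memcpy(buf, dir, dlen);
    buf[dlen] = '/';
    memcpy(buf + dlen + 1, name, nlen + 1); /* copies the NUL as well */
    return buf;
}
```

Every call site that currently declares char buf[MAX_PATH] would need this
kind of rewrite plus matching cleanup, which gives a feel for why the change
is not a small one.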

Chris




* Re: Accessing filenames with different charsets
  2002-07-03  3:38           ` Chris January
@ 2002-07-03  4:00             ` Ville Herva
  2002-07-03  5:20             ` Re[2]: " Nicholas Wourms
  1 sibling, 0 replies; 15+ messages in thread
From: Ville Herva @ 2002-07-03  4:00 UTC (permalink / raw)
  To: cygwin

On Wed, Jul 03, 2002 at 11:30:19AM +0100, you [Chris January] wrote:
> Since most programs internally allocate a buffer of size MAX_PATH or
> PATH_MAX, they won't have enough room to store the full filename. 

True.

> It would certainly be possible to support this if a system call was made
> with a long filename,

The problem here is that you can't get the long filename from readdir() and
friends if the d_name field is limited. So I imagine things like tar and
rsync wouldn't work anyway.

> but that would mean replacing all statically allocated path buffers (e.g.
> char buf[MAX_PATH]) with alloca (e.g. char
> *buf = alloca (strlen(inbuf) + margin)) which is more than trivial.

Yes, sounds quite tedious.

In this case one could imagine using the 8.3 name... 


-- v --

v@iki.fi


* Re[2]: Accessing filenames with different charsets
  2002-07-03  3:38           ` Chris January
  2002-07-03  4:00             ` Ville Herva
@ 2002-07-03  5:20             ` Nicholas Wourms
  2002-07-03  5:29               ` Chris January
  1 sibling, 1 reply; 15+ messages in thread
From: Nicholas Wourms @ 2002-07-03  5:20 UTC (permalink / raw)
  To: Chris January, Ville Herva; +Cc: cygwin

Chris,

Is this going to be an NT-only solution or a global solution?  I only ask
because there exists a redistributable dll from Microsoft which allows for
Unicode support on Win9X/ME.  Any chance that this functionality might be
included in Cygwin?  I suppose the w32api people would have to transform
the .lib file that comes in the SDK to a unixy library, but other than
that, it should be possible.  Then you could tell people who run on such
platforms to go to the site and download the dll if they want the Unicode
support.  Or better yet, since redistribution is allowed, have setup.exe
install it on demand.  I'm not too familiar with Unicode in general, so if
your solution can be implemented on Win9X/ME w/o the extra MS library,
then forgive me for mentioning it.  Otherwise, I hope you will explore
this avenue.

The specific url is:
http://msdn.microsoft.com/library/default.asp?url=/library/en-us/win9x/unilayer_4wj7.asp

Cheers,
Nicholas


* Re: Re[2]: Accessing filenames with different charsets
  2002-07-03  5:20             ` Re[2]: " Nicholas Wourms
@ 2002-07-03  5:29               ` Chris January
  0 siblings, 0 replies; 15+ messages in thread
From: Chris January @ 2002-07-03  5:29 UTC (permalink / raw)
  To: cygwin

> Chris,
>
> Is this going to be an NT-only solution or a global solution?  I only ask
> because there exists a redistributable dll from microsoft which allows for
> Unicode support on Win9X/ME.  Any chance that this functionality might be
> included in Cygwin?  I suppose the w32api people would have to transform
> the .lib file that comes in the SDK to a unixy library, but other than
> that, it should be possible.  Then you could tell people who run on such
> platforms to go to the site and download the dll if they want the unicode
> support.  Or better yet, since redistribution is allowed, have setup.exe
> install it on demand.  I'm not too familiar with Unicode in general, so if
> your solution can be implemented on Win9X/ME w/o the extra MS Library,
> then forgive me for mentioning it.  Otherwise, I hope you will explore
> this avenue.
>
> The specific url is:
> http://msdn.microsoft.com/library/default.asp?url=/library/en-us/win9x/unilayer_4wj7.asp

What my patch basically does is convert Unicode filenames to UTF8 and back
again. If your filesystem doesn't support Unicode filenames, you will have no
use for this patch. The Microsoft DLL may well add Unicode APIs to Win9x/Me,
but it doesn't add Unicode support to VFAT as well, so it won't be much use.
Just to add, there is nothing to stop you using UTF8 filenames under
Win9x/Me. Applications written to support UTF8 should work fine under
Win9x/Me. The patch is to allow conversion of said UTF8 filenames to Windows
native Unicode filenames on Windows NT systems and vice-versa.

Chris




* RE: Accessing filenames with different charsets
  2002-07-03  3:30         ` Ville Herva
  2002-07-03  3:38           ` Chris January
@ 2002-07-03  6:24           ` Bernard A Badger
  2002-07-03  6:31             ` Chris January
  1 sibling, 1 reply; 15+ messages in thread
From: Bernard A Badger @ 2002-07-03  6:24 UTC (permalink / raw)
  To: cygwin

> -----Original Message-----
> From: ...Ville Herva
> Sent: Wednesday, July 03, 2002 6:18 AM
> Subject: Re: Accessing filenames with different charsets
...
> What about filenames longer than MAX_PATH? Those can only be accessed with
> "\\.\<path>" and unicode file functions...
> 
What is "\\.\<path>"?  Doesn't work for me:

dir \\.\
The filename, directory name, or volume label syntax is incorrect.



* Re: Accessing filenames with different charsets
  2002-07-03  6:24           ` Bernard A Badger
@ 2002-07-03  6:31             ` Chris January
  2002-07-03  9:08               ` dir \\.\c:\ Bernard A Badger
  0 siblings, 1 reply; 15+ messages in thread
From: Chris January @ 2002-07-03  6:31 UTC (permalink / raw)
  To: cygwin

> > What about filenames longer than MAX_PATH? Those can only be accessed with
> > "\\.\<path>" and unicode file functions...
> >
> What is "\\.\<path>"?  Doesn't work for me:
>
> dir \\.\
> The filename, directory name, or volume label syntax is incorrect.

OT, but try this instead:
dir \\.\c:\

Chris




* dir \\.\c:\
  2002-07-03  6:31             ` Chris January
@ 2002-07-03  9:08               ` Bernard A Badger
  2002-07-03 12:51                 ` Ville Herva
  0 siblings, 1 reply; 15+ messages in thread
From: Bernard A Badger @ 2002-07-03  9:08 UTC (permalink / raw)
  To: cygwin

> -----Original Message-----
> From: cygwin-owner@cygwin.com [mailto:cygwin-owner@cygwin.com]On Behalf
> Of Chris January
> Subject: Re: Accessing filenames with different charsets
> 
> > > What about filenames longer than MAX_PATH? Those can only be accessed with
> > >
> > What is "\\.\<path>"?  Doesn't work for me:
> >
> > dir \\.\
> > The filename, directory name, or volume label syntax is incorrect.
> OT, but try this instead:
> dir \\.\c:\
> 
> Chris
So what _is_ it?  It doesn't seem like a legitimate UNC name.
It doesn't match \\Host\share\dir\dir2\.
dir \\%COMPUTERNAME%\c:\ doesn't work either.

(Sorry if this is too OT.)




* Re: dir \\.\c:\
  2002-07-03  9:08               ` dir \\.\c:\ Bernard A Badger
@ 2002-07-03 12:51                 ` Ville Herva
  0 siblings, 0 replies; 15+ messages in thread
From: Ville Herva @ 2002-07-03 12:51 UTC (permalink / raw)
  To: cygwin

On Wed, Jul 03, 2002 at 11:59:51AM -0400, you [Bernard A Badger] wrote:
>
> So what _is_ it?  It doesn't seem like a legitimate UNC name.
> It doesn't match \\Host\share\dir\dir2\.
> dir \\%COMPUTERNAME%\c:\ doesn't work either.
> 
> (Sorry if this is too OT.)

It is a way to access cumbersome filenames such as "aux", "prn", "com",
"lpt1" and filenames with more than MAX_PATH characters in them. (I think I
typed the prefix wrong in the previous mail - it should be "\\?\". "\\.\" is
the prefix for NT device names such as "\\.\PhysicalDrive0".)

FindFirstFile() documentation:

"Windows NT/2000: In the ANSI version of this function, the name is limited
to MAX_PATH characters. To extend this limit to nearly 32,000 wide
characters, call the Unicode version of the function and prepend "\\?\" to
the path."

and file naming conventions in MSDN:

"The Unicode versions of several functions permit paths that exceed the
MAX_PATH length if the path has the "\\?\" prefix. The "\\?\" tells the
function to turn off path parsing. However, each component in the path
cannot be more than MAX_PATH characters long. Use the "\\?\" prefix with
paths for local storage devices and the "\\?\UNC\" prefix with paths having
the Universal Naming Convention (UNC) format. The "\\?\" is ignored as part
of the path. For example, "\\?\C:\myworld\private" is seen as
"C:\myworld\private", and "\\?\UNC\bill_g_1\hotstuff\coolapps" is seen as
"\\bill_g_1\hotstuff\coolapps". "

(Yes, the bill_g example is actually Microsoft's :)

In addition, you can probably find a KB article talking about nasty
filenames ("com", "aux" etc) and how to get rid of them. Since the "\\?\"
prefix turns off path parsing, you can access them with the prefix.
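
The prefix rules quoted above could be sketched as follows. This is an
illustration only: add_long_path_prefix() is a hypothetical helper, it works
on plain char strings for readability (the real functions that honour the
prefix take wide strings), and it assumes the input is already an absolute
path.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Return a newly allocated copy of an absolute Windows path with the
 * "\\?\" prefix applied (hypothetical helper; caller frees).
 *   C:\dir\file       -> \\?\C:\dir\file
 *   \\server\share\f  -> \\?\UNC\server\share\f
 * Paths that already start with "\\?\" or with the device prefix "\\.\"
 * are copied unchanged. Returns NULL if the allocation fails. */
char *add_long_path_prefix(const char *path)
{
    const char *prefix;
    const char *rest;
    char *out;

    assert(path != NULL);
    if (strncmp(path, "\\\\?\\", 4) == 0 || strncmp(path, "\\\\.\\", 4) == 0) {
        prefix = "";
        rest = path;
    } else if (strncmp(path, "\\\\", 2) == 0) {
        prefix = "\\\\?\\UNC\\";   /* replaces the leading two backslashes */
        rest = path + 2;
    } else {
        prefix = "\\\\?\\";
        rest = path;
    }

    out = malloc(strlen(prefix) + strlen(rest) + 1);
    if (out == NULL)
        return NULL;
    strcpy(out, prefix);
    strcat(out, rest);
    return out;
}
```

Applied to the two MSDN examples, "C:\myworld\private" would become
"\\?\C:\myworld\private" and "\\bill_g_1\hotstuff\coolapps" would become
"\\?\UNC\bill_g_1\hotstuff\coolapps".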


-- v --

v@iki.fi

