From: Brian Inglis <Brian.Inglis@SystematicSw.ab.ca>
To: cygwin@cygwin.com
Subject: Re: bug in cygstart utility
Date: Mon, 1 Mar 2021 08:06:11 -0700
Message-ID: <b21f0310-29b0-c7b2-fa72-f00326fd93a7@SystematicSw.ab.ca>
In-Reply-To: <PA4PR03MB69437E82C15ECDD7203A8506FB9A9@PA4PR03MB6943.eurprd03.prod.outlook.com>

On 2021-03-01 04:17, John Vincent via Cygwin wrote:
> I'm running cygwin on Windows 10, using UTF8 in English. I run cygwin bash 
> inside a cygwin mintty terminal. I've noticed a minor problem when using 
> cygstart with wildcard parameters.
> I type:
>	$ cygstart *.??p
> If there is a matching file then everything works as I expect. However if
> there is no matching file I get an error message as follows:
> Unable to start '.p': The specified file was not found.
> When I look at this using the "od" command I see the following:
> $ cygstart *.??p 2>&1 | od -tx1 -c
> 0000000  55  6e  61  62  6c  65  20  74  6f  20  73  74  61  72  74  20
>           U   n   a   b   l   e       t   o       s   t   a   r   t
> 0000020  27  ef  80  aa  2e  ef  80  bf  ef  80  bf  70  27  3a  20  54
>           ' 357 200 252   . 357 200 277 357 200 277   p   '   :       T
> 0000040  68  65  20  73  70  65  63  69  66  69  65  64  20  66  69  6c
>           h   e       s   p   e   c   i   f   i   e   d       f   i   l
> 0000060  65  20  77  61  73  20  6e  6f  74  20  66  6f  75  6e  64  2e
>           e       w   a   s       n   o   t       f   o   u   n   d   .
> 0000100  0a
>          \n
> It looks to me like cygstart is not outputting the correct UTF-8 for either
> the * character or the ? character. I think this is a bug.
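
To support POSIX path names, Cygwin allows any character other than \0 and / in
a file name, so it maps characters that are special to Windows into the Unicode
BMP Private Use Area (PUA). In your od output, ef 80 aa is U+F02A ('*' is 0x2A)
and ef 80 bf is U+F03F ('?' is 0x3F):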

https://cygwin.com/cygwin-ug-net/using-specialnames.html#pathnames-specialchars

http://www.unicode.org/faq/private_use.html

https://en.wikipedia.org/wiki/Private_Use_Areas

Cygwin may also prefix characters that the current code page cannot represent
with CAN (0x18).
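
As a rough check of the mapping described above, the bytes that od printed
between the quotes can be decoded and compared with a forward mapping of the
original pattern. This is only a sketch: the helper name to_pua and the
restriction to '*' and '?' are assumptions made for illustration (Cygwin maps
more characters than these two), and the 0xF000 offset is inferred from the od
output rather than taken from the Cygwin source.

```python
# Bytes od reported between the quotes in the cygstart error message.
raw = bytes.fromhex("ef80aa2eef80bfef80bf70")

# Decode as UTF-8 and list the code points.
text = raw.decode("utf-8")
code_points = [f"U+{ord(c):04X}" for c in text]
print(code_points)

# Assumption: special characters are shifted by 0xF000 into the PUA.
PUA_BASE = 0xF000

def to_pua(name: str, special: str = "*?") -> str:
    """Hypothetical forward mapping, limited to the characters seen here."""
    return "".join(chr(PUA_BASE + ord(c)) if c in special else c for c in name)

# The unmatched glob "*.??p" maps to exactly the bytes in the od dump.
print(to_pua("*.??p").encode("utf-8") == raw)
```

The '.' and 'p' pass through unchanged, which is why only those two characters
show up in the terminal.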

The bug is that the error message displays the remapped string, with its
undisplayable PUA characters, rather than either the reverse-mapped string or
the original input path name.
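
One way to picture the reverse mapping is sketched below. It is not the
cygutils code (cygstart is written in C), and the function name from_pua, the
0xF000 offset, and the choice to reverse only printable ASCII are all
assumptions made for this illustration.

```python
# Assumption: remapped characters sit at 0xF000 + the ASCII value.
PUA_BASE = 0xF000

def from_pua(name: str) -> str:
    """Hypothetical reverse mapping for use before printing a path name."""
    out = []
    for ch in name:
        low = ord(ch) - PUA_BASE
        # Only undo the shift when the result is printable ASCII.
        if 0x20 <= low <= 0x7E:
            out.append(chr(low))
        else:
            out.append(ch)
    return "".join(out)

# The remapped string that currently ends up in the error message.
mapped = "\uf02a.\uf03f\uf03fp"
message = f"Unable to start '{from_pua(mapped)}': The specified file was not found."
print(message)
```

With a step like this applied before formatting, the message would show the
pattern as typed instead of the PUA code points.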

-- 
Take care. Thanks, Brian Inglis, Calgary, Alberta, Canada

This email may be disturbing to some readers as it contains
too much technical detail. Reader discretion is advised.
[Data in binary units and prefixes, physical quantities in SI.]
