public inbox for cygwin@cygwin.com
* Cygwin fails to utilize Unicode replacement character
@ 2018-09-01 16:13 Steven Penny
  2018-09-01 18:11 ` Thomas Wolff
                   ` (3 more replies)
  0 siblings, 4 replies; 62+ messages in thread
From: Steven Penny @ 2018-09-01 16:13 UTC (permalink / raw)
  To: cygwin

Using this file:

    $ printf '\353\n' > alfa.txt

    $ iconv -f CP1252 alfa.txt
    ë

You get this result with Linux:

    $ cat alfa.txt
    �

Where "cat" properly outputs Unicode 'REPLACEMENT CHARACTER' (U+FFFD). However
with Cygwin you get this:

    $ cat alfa.txt
    ▒

Where "cat" outputs Unicode Character 'MEDIUM SHADE' (U+2592).
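
To see why the lone byte 0xEB (octal 353) has no valid reading in UTF-8, here
is a minimal sketch of a decoder that substitutes U+FFFD for malformed input.
The function name decode_utf8 and its one-byte-at-a-time recovery policy are
assumptions made for illustration; this is not the code used by cat, by any
terminal, or by Cygwin. Note that cat only passes the byte through; it is the
terminal that decides what to draw.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define REPLACEMENT_CHARACTER 0xFFFDu

/* Hypothetical helper: decode one code point from the first bytes of s
   (len must be > 0).  Stores the result in *cp and returns the number of
   bytes consumed.  Any malformed sequence yields U+FFFD and consumes
   exactly one byte. */
size_t
decode_utf8 (const unsigned char *s, size_t len, uint32_t *cp)
{
  size_t need, i;
  uint32_t value, min;

  assert (s != NULL && cp != NULL && len > 0);
  *cp = REPLACEMENT_CHARACTER;      /* assume failure until proven valid */

  if (s[0] < 0x80)
    {
      *cp = s[0];                   /* plain ASCII */
      return 1;
    }
  else if ((s[0] & 0xE0) == 0xC0)
    { need = 1; value = s[0] & 0x1F; min = 0x80; }
  else if ((s[0] & 0xF0) == 0xE0)
    { need = 2; value = s[0] & 0x0F; min = 0x800; }
  else if ((s[0] & 0xF8) == 0xF0)
    { need = 3; value = s[0] & 0x07; min = 0x10000; }
  else
    return 1;                       /* stray continuation or invalid lead byte */

  if (len < need + 1)
    return 1;                       /* truncated sequence */
  for (i = 1; i <= need; i++)
    {
      if ((s[i] & 0xC0) != 0x80)
        return 1;                   /* missing continuation byte */
      value = (value << 6) | (s[i] & 0x3F);
    }
  if (value < min || value > 0x10FFFF
      || (value >= 0xD800 && value <= 0xDFFF))
    return 1;                       /* overlong, out of range, or surrogate */

  *cp = value;
  return need + 1;
}
```

Given the two bytes of alfa.txt, 0xEB announces a three-byte sequence that
never arrives, so the sketch reports U+FFFD for it and then decodes the line
feed normally. The UTF-8 encoding of the character that iconv printed is the
two-byte sequence C3 AB instead.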


--
Problem reports:       http://cygwin.com/problems.html
FAQ:                   http://cygwin.com/faq/
Documentation:         http://cygwin.com/docs.html
Unsubscribe info:      http://cygwin.com/ml/#unsubscribe-simple

* Re: Cygwin fails to utilize Unicode replacement character
  2018-09-01 16:13 Cygwin fails to utilize Unicode replacement character Steven Penny
@ 2018-09-01 18:11 ` Thomas Wolff
  2018-09-01 18:46   ` Steven Penny
  2018-09-01 19:40 ` Corinna Vinschen
                   ` (2 subsequent siblings)
  3 siblings, 1 reply; 62+ messages in thread
From: Thomas Wolff @ 2018-09-01 18:11 UTC (permalink / raw)
  To: cygwin

On 01.09.2018 at 18:13, Steven Penny wrote:
> Using this file:
>
>    $ printf '\353\n' > alfa.txt
>
>    $ iconv -f CP1252 alfa.txt
>    ë
>
> You get this result with Linux:
>
>    $ cat alfa.txt
>    �
>
> Where "cat" properly outputs Unicode 'REPLACEMENT CHARACTER' (U+FFFD). However
> with Cygwin you get this:
>
>    $ cat alfa.txt
>    ▒
>
> Where "cat" outputs Unicode Character 'MEDIUM SHADE' (U+2592).
Which terminals are used and what's the output of `locale` and
`cat --version` in both cases?

* Re: Cygwin fails to utilize Unicode replacement character
  2018-09-01 18:11 ` Thomas Wolff
@ 2018-09-01 18:46   ` Steven Penny
  2018-09-01 21:07     ` Thomas Wolff
  0 siblings, 1 reply; 62+ messages in thread
From: Steven Penny @ 2018-09-01 18:46 UTC (permalink / raw)
  To: cygwin

On Sat, 1 Sep 2018 20:11:15, Thomas Wolff wrote:
> Which terminals are used and what's the output of `locale` and
> `cat --version` in both cases?

Linux:

    $ echo "$TERM"
    xterm-256color

    $ locale
    LANG=en_US.UTF-8
    LC_CTYPE="en_US.UTF-8"
    LC_NUMERIC="en_US.UTF-8"
    LC_TIME="en_US.UTF-8"
    LC_COLLATE=C
    LC_MONETARY="en_US.UTF-8"
    LC_MESSAGES="en_US.UTF-8"
    LC_PAPER="en_US.UTF-8"
    LC_NAME="en_US.UTF-8"
    LC_ADDRESS="en_US.UTF-8"
    LC_TELEPHONE="en_US.UTF-8"
    LC_MEASUREMENT="en_US.UTF-8"
    LC_IDENTIFICATION="en_US.UTF-8"
    LC_ALL=

    $ cat --version
    cat (GNU coreutils) 8.29

Cygwin:

    $ echo "$TERM"
    cygwin

    $ locale
    LANG=en_US.UTF-8
    LC_CTYPE="en_US.UTF-8"
    LC_NUMERIC="en_US.UTF-8"
    LC_TIME="en_US.UTF-8"
    LC_COLLATE="C"
    LC_MONETARY="en_US.UTF-8"
    LC_MESSAGES="en_US.UTF-8"
    LC_ALL=

    $ cat --version
    cat (GNU coreutils) 8.26

Note that in addition to Linux, Windows PowerShell also gives correct output:

    $ pwsh -c '[system.text.encoding]::UTF8.getString(0xEB)'
    �

compare again with Cygwin:

    $ printf '\xEB'
    ▒


* Re: Cygwin fails to utilize Unicode replacement character
  2018-09-01 16:13 Cygwin fails to utilize Unicode replacement character Steven Penny
  2018-09-01 18:11 ` Thomas Wolff
@ 2018-09-01 19:40 ` Corinna Vinschen
  2018-09-01 21:50 ` Doug Henderson
  2018-09-04 19:59 ` Doug Henderson
  3 siblings, 0 replies; 62+ messages in thread
From: Corinna Vinschen @ 2018-09-01 19:40 UTC (permalink / raw)
  To: cygwin

On Sep  1 09:13, Steven Penny wrote:
> Using this file:
> 
>    $ printf '\353\n' > alfa.txt
> 
>    $ iconv -f CP1252 alfa.txt
>    ë
> 
> You get this result with Linux:
> 
>    $ cat alfa.txt
>    �
> 
> Where "cat" properly outputs Unicode 'REPLACEMENT CHARACTER' (U+FFFD). However
> with Cygwin you get this:
> 
>    $ cat alfa.txt
>    ▒
> 
> Where "cat" outputs Unicode Character 'MEDIUM SHADE' (U+2592).

I changed it in Cygwin for the console window.  For mintty this would
have to be changed in mintty itself.


Corinna

-- 
Corinna Vinschen                  Please, send mails regarding Cygwin to
Cygwin Maintainer                 cygwin AT cygwin DOT com
Red Hat

* Re: Cygwin fails to utilize Unicode replacement character
  2018-09-01 18:46   ` Steven Penny
@ 2018-09-01 21:07     ` Thomas Wolff
  0 siblings, 0 replies; 62+ messages in thread
From: Thomas Wolff @ 2018-09-01 21:07 UTC (permalink / raw)
  To: cygwin

On 01.09.2018 at 20:46, Steven Penny wrote:
> On Sat, 1 Sep 2018 20:11:15, Thomas Wolff wrote:
>> Which terminals are used and what's the output of `locale` and
>> `cat --version` in both cases?
>
> ...
>
> Note that in addition to Linux, Windows PowerShell also gives correct output:
>
>    $ pwsh -c '[system.text.encoding]::UTF8.getString(0xEB)'
>    �
What makes you claim this would be the "correct output"? Where is this 
defined?

> compare again with Cygwin:
>
>    $ printf '\xEB'
>    ▒
Actually, in mintty, this is no longer the MEDIUM SHADE. Please compare.
There's also a problem with using MEDIUM SHADE. In an ambiguous-width
locale (or explicit ambiguous-width terminal mode), that character is
double-width and is therefore not suitable as a replacement for a single
illegal UTF-8 byte.
The Cygwin console does not support double-width, so it does not have this
problem, but until further clarification I think I'll not change it in
mintty.
Thomas

* Re: Cygwin fails to utilize Unicode replacement character
  2018-09-01 16:13 Cygwin fails to utilize Unicode replacement character Steven Penny
  2018-09-01 18:11 ` Thomas Wolff
  2018-09-01 19:40 ` Corinna Vinschen
@ 2018-09-01 21:50 ` Doug Henderson
  2018-09-01 22:49   ` Steven Penny
  2018-09-04 19:59 ` Doug Henderson
  3 siblings, 1 reply; 62+ messages in thread
From: Doug Henderson @ 2018-09-01 21:50 UTC (permalink / raw)
  To: cygwin

On Sat, 1 Sep 2018 at 10:13, Steven Penny wrote:
...
> You get this result with Linux:
>
>     $ cat alfa.txt
>     �
...
> with Cygwin you get this:
>
>     $ cat alfa.txt
>     ▒
...
This is an issue with rendering the character in the terminal window.
In both the CMD/Conhost/bash and Mintty/bash terminals, I have
configured the font to be Lucida Console. This font does not have a
glyph for U+FFFD: Replacement Character. (To check your character set,
open Charmap, and check Advanced View. Type "Replacement Character" in
the Search field, and search.) In the absence of that glyph, the
terminal program must choose a glyph to display. In a later reply,
Thomas Wolff, the maintainer of Mintty, indicates that Mintty displays
the glyph for U+2592: Medium Shade (or a similar one). Without
reference to the source, it is difficult to be certain, but Conhost
appears to use a similar glyph.

In Mintty, if you choose a font, such as DejaVu Sans Mono, which
contains a glyph for U+FFFD: Replacement Character, you could expect
to see that glyph; however, that is determined by the terminal. As I
write this, both Mintty (2.9.0) and Conhost (Windows 10 Home,
10.0.17134 Build 17134, fully patched) display a glyph with the
appearance of U+2592 Medium Shade.

So, IMHO, to provide a similar visual, across all fonts and terminals,
these programs need to display a glyph common to all, such as the
Medium Shade.

HTH,
Doug

BTW, in my Debian 9.5 VM, the Replacement Character is displayed. The
Characters app shows that the default font contains the Replacement
Character, and that is what is displayed in the terminal.
-- 
Doug Henderson, Calgary, Alberta, Canada - from gmail.com

* Re: Cygwin fails to utilize Unicode replacement character
  2018-09-01 21:50 ` Doug Henderson
@ 2018-09-01 22:49   ` Steven Penny
  2018-09-02  8:07     ` Thomas Wolff
  0 siblings, 1 reply; 62+ messages in thread
From: Steven Penny @ 2018-09-01 22:49 UTC (permalink / raw)
  To: cygwin

On Sat, 1 Sep 2018 15:50:04, Doug Henderson wrote:
> This is an issue with rendering the character in the terminal window.
> In both the CMD/Conhost/bash and Mintty/bash terminals, I have
> configured the font to be Lucida Console. This font does not have a
> glyph for U+FFFD: Replacement Character. (To check your character set,
> open Charmap, and check Advanced View. Type "Replacement Character" on
> the Search field, and search.) In the absence of that glyph, the
> terminal program must choose a glyph to display. In a later reply,
> Thomas Wolff, the maintainer of Mintty, indicates that Mintty displays
> the glyph for U+2592: Medium Shade (or a similar one). Without
> reference to the source, it is difficult to be certain, but Conhost
> appears to use a similar glyph.
>
> In Mintty, if you choose a font, such as DejaVu Sans Mono, which
> contains a glyph for U+FFFD: Replacement Character, you could expect
> to see that glyph, however that is determined by the terminal. As I
> write this, both Mintty (2.9.0) and Conhost (Windows 10 Home,
> 10.0.17134 Build 17134, fully patched) display a glyph with the
> appearance of U+2592 Medium Shade.

Hm, this is a tough call. These fonts come with Windows:

- Consolas
- Lucida Console

Neither of them provides U+FFFD, so that means it will fall back to the
".notdef glyph":

http://docs.microsoft.com/typography/opentype/spec/recom

That presents 2 options:

U+FFFD:
  unicode conformant:
    yes
  consolas or lucida console:
    invalid byte or missing char: same glyph

U+2592:
  unicode conformant:
    no
  consolas or lucida console:
    invalid byte or missing char: different glyph
    
I would prefer the first option - as other fonts do define U+FFFD, including
"DejaVu Sans Mono" which Cygwin provides via the "dejavu-fonts" package. However
I can understand if we wanted to side with people using one of the default
fonts.


* Re: Cygwin fails to utilize Unicode replacement character
  2018-09-01 22:49   ` Steven Penny
@ 2018-09-02  8:07     ` Thomas Wolff
  2018-09-02 12:51       ` Steven Penny
  0 siblings, 1 reply; 62+ messages in thread
From: Thomas Wolff @ 2018-09-02  8:07 UTC (permalink / raw)
  To: cygwin

On 02.09.2018 at 00:49, Steven Penny wrote:
> On Sat, 1 Sep 2018 15:50:04, Doug Henderson wrote:
>> This is an issue with rendering the character in the terminal window.
>> In both the CMD/Conhost/bash and Mintty/bash terminals, I have
>> configured the font to be Lucida Console. This font does not have a
>> glyph for U+FFFD: Replacement Character. (To check your character set,
>> open Charmap, and check Advanced View. Type "Replacement Character" on
>> the Search field, and search.) In the absence of that glyph, the
>> terminal program must choose a glyph to display. In a later reply,
>> Thomas Wolff, the maintainer of Mintty, indicates that Mintty displays
>> the glyph for U+2592: Medium Shade (or a similar one). Without
>> reference to the source, it is difficult to be certain, but Conhost
>> appears to use a similar glyph.
>>
>> In Mintty, if you choose a font, such as DejaVu Sans Mono, which
>> contains a glyph for U+FFFD: Replacement Character, you could expect
>> to see that glyph, however that is determined by the terminal. As I
>> write this, both Mintty (2.9.0) and Conhost (Windows 10 Home,
>> 10.0.17134 Build 17134, fully patched) display a glyph with the
>> appearance of U+2592 Medium Shade.
>
> Hm, this is a tough call. These fonts come with Windows:
>
> - Consolas
> - Lucida Console
>
> Neither of them provide U+FFFD, so that means it will fall back to the
> ".notdef glyph":
>
> http://docs.microsoft.com/typography/opentype/spec/recom
>
> That presents 2 options:
>
> U+FFFD:
>  unicode conformant:
>    yes
>  consolas or lucida console:
>    invalid byte or missing char: same glyph
>
> U+2592:
>  unicode conformant:
>    no
>  consolas or lucida console:
>    invalid byte or missing char: different glyph
>
> I would prefer the first option - as other fonts do define U+FFFD, including
> "DejaVu Sans Mono" which Cygwin provides via the "dejavu-fonts" package. However
> I can understand if we wanted to side with people using one of the default
> fonts.
Actually, the width problem I suggested in my other response (and even 
referring to the wrong character) does not apply as mintty enforces 
proper width in that case.
Also, even with fonts that do not provide the glyph, you will usually 
still see it by the Windows font fallback mechanism.
Shall I make it configurable?
Thomas

* Re: Cygwin fails to utilize Unicode replacement character
  2018-09-02  8:07     ` Thomas Wolff
@ 2018-09-02 12:51       ` Steven Penny
  2018-09-03 12:46         ` Corinna Vinschen
  2018-09-03 16:05         ` Brian Inglis
  0 siblings, 2 replies; 62+ messages in thread
From: Steven Penny @ 2018-09-02 12:51 UTC (permalink / raw)
  To: cygwin

On Sun, 2 Sep 2018 10:07:10, Thomas Wolff wrote:
> Actually, the width problem I suggested in my other response (and even 
> referring to the wrong character) does not apply as mintty enforces 
> proper width in that case.
> Also, even with fonts that do not provide the glyph, you will usually 
> still see it by the Windows font fallback mechanism.
> Shall I make it configurable?

your call - here are the possible resolutions - in order of my preference:

1. Change the default to U+FFFD with no option
2. Change the default to U+FFFD with option to change
3. Leave default as is with option to change


* Re: Cygwin fails to utilize Unicode replacement character
  2018-09-02 12:51       ` Steven Penny
@ 2018-09-03 12:46         ` Corinna Vinschen
  2018-09-03 14:59           ` Corinna Vinschen
  2018-09-03 16:05         ` Brian Inglis
  1 sibling, 1 reply; 62+ messages in thread
From: Corinna Vinschen @ 2018-09-03 12:46 UTC (permalink / raw)
  To: cygwin

On Sep  2 05:51, Steven Penny wrote:
> On Sun, 2 Sep 2018 10:07:10, Thomas Wolff wrote:
> > Actually, the width problem I suggested in my other response (and even
> > referring to the wrong character) does not apply as mintty enforces
> > proper width in that case.
> > Also, even with fonts that do not provide the glyph, you will usually
> > still see it by the Windows font fallback mechanism.
> > Shall I make it configurable?
> 
> your call - here are the possible resolutions - in order of my preference:
> 
> 1. Change the default to U+FFFD with no option
> 2. Change the default to U+FFFD with option to change
> 3. Leave default as is with option to change

Ideally we could check if the current font supports a visual
representation of 0xfffd and if not, fall back to 0x2592.

Not sure how feasible that is, but it doesn't seem to be overly
complicated.  I'm just looking into a solution for the Cygwin
console.
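
The selection logic itself could be sketched as follows. This is only an
illustration under stated assumptions: struct font_coverage, font_has_glyph
and pick_replacement_char are hypothetical names, and the coverage table
stands in for whatever query the platform really offers for the current font.
It is not the Cygwin console code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for "the current font": a plain list of the
   characters it has glyphs for.  A real implementation would ask the
   platform instead of consulting a table. */
struct font_coverage
{
  const wchar_t *chars;
  size_t count;
};

/* Return true if the (hypothetical) coverage table lists ch. */
bool
font_has_glyph (const struct font_coverage *font, wchar_t ch)
{
  size_t i;

  assert (font != NULL);
  for (i = 0; i < font->count; i++)
    if (font->chars[i] == ch)
      return true;
  return false;
}

/* Prefer U+FFFD; fall back to U+2592 only when the font cannot show it. */
wchar_t
pick_replacement_char (const struct font_coverage *font)
{
  static const wchar_t replacement_char[2] =
    {
      0xfffd, /* REPLACEMENT CHARACTER */
      0x2592  /* MEDIUM SHADE */
    };

  return font_has_glyph (font, replacement_char[0])
	 ? replacement_char[0] : replacement_char[1];
}
```

Keeping the font query behind one small predicate means the preference order
stays in a single place, and only the predicate has to change per terminal.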


Corinna

-- 
Corinna Vinschen                  Please, send mails regarding Cygwin to
Cygwin Maintainer                 cygwin AT cygwin DOT com
Red Hat

* Re: Cygwin fails to utilize Unicode replacement character
  2018-09-03 12:46         ` Corinna Vinschen
@ 2018-09-03 14:59           ` Corinna Vinschen
  2018-09-03 16:34             ` Thomas Wolff
  0 siblings, 1 reply; 62+ messages in thread
From: Corinna Vinschen @ 2018-09-03 14:59 UTC (permalink / raw)
  To: cygwin

On Sep  3 14:46, Corinna Vinschen wrote:
> On Sep  2 05:51, Steven Penny wrote:
> > On Sun, 2 Sep 2018 10:07:10, Thomas Wolff wrote:
> > > Actually, the width problem I suggested in my other response (and even
> > > referring to the wrong character) does not apply as mintty enforces
> > > proper width in that case.
> > > Also, even with fonts that do not provide the glyph, you will usually
> > > still see it by the Windows font fallback mechanism.
> > > Shall I make it configurable?
> > 
> > your call - here are the possible resolutions - in order of my preference:
> > 
> > 1. Change the default to U+FFFD with no option
> > 2. Change the default to U+FFFD with option to change
> > 3. Leave default as is with option to change
> 
> Ideally we could check if the current font supports a visual
> representation of 0xfffd and if not, fall back to 0x2592.
> 
> Not sure how feasible that is, but it doesn't seem to be overly
> complicated.  I'm just looking into a solution for the Cygwin
> console.

Only, I can't get this working.  In theory the GDI function
GetGlyphIndicesW is supposed to allow checking if a certain character
exists.  But I'm getting a weird result.  This code:

  static const wchar_t replacement_char[2] =
    {
      0xfffd, /* REPLACEMENT CHARACTER */
      0x2592  /* MEDIUM SHADE */
    };
  HWND cwnd = GetConsoleWindow ();
  HDC cdc = GetDC (cwnd);
  int rp_idx = 0;
  WORD gi = 0;
  DWORD ret = GetGlyphIndicesW (cdc, replacement_char, 1, &gi,
                                GGI_MARK_NONEXISTING_GLYPHS);
  if (ret != GDI_ERROR && gi == 0xffff)
    rp_idx = 1;

always sets rp_idx to 1 when called from inside the Cygwin DLL,
independently of the actual console font.  And, here's the really weird
thing, it always sets rp_idx to 0 when called directly from an
application, likewise independently of the actual console font.

Does anybody have an idea what I'm doing wrong?

Just as a side note:

- GetTextFaceW always returns font number 7 called "System", independently
  of the actual current font set in the console.
- GetCurrentConsoleFont always returns a font number of 0, independently
  of the actual current font set in the console.
  GetCurrentConsoleFontEx always returns with error 87, "invalid
  parameter"

Something's very fishy.  Thanks for any actual help.


Corinna

-- 
Corinna Vinschen                  Please, send mails regarding Cygwin to
Cygwin Maintainer                 cygwin AT cygwin DOT com
Red Hat

* Cygwin fails to utilize Unicode replacement character
  2018-09-02 12:51       ` Steven Penny
  2018-09-03 12:46         ` Corinna Vinschen
@ 2018-09-03 16:05         ` Brian Inglis
  1 sibling, 0 replies; 62+ messages in thread
From: Brian Inglis @ 2018-09-03 16:05 UTC (permalink / raw)
  To: cygwin

On 2018-09-02 06:51, Steven Penny wrote:
> On Sun, 2 Sep 2018 10:07:10, Thomas Wolff wrote:
>> Actually, the width problem I suggested in my other response (and even 
>> referring to the wrong character) does not apply as mintty enforces proper 
>> width in that case.
>> Also, even with fonts that do not provide the glyph, you will usually
>> still see it by the Windows font fallback mechanism.
>> Shall I make it configurable?
> your call - here are the possible resolutions - in order of my preference:

What about TTF .notdef glyph index 0?

> 1. Change the default to U+FFFD with no option
> 2. Change the default to U+FFFD with option to change
> 3. Leave default as is with option to change

-- 
Take care. Thanks, Brian Inglis, Calgary, Alberta, Canada

This email may be disturbing to some readers as it contains
too much technical detail. Reader discretion is advised.

* Re: Cygwin fails to utilize Unicode replacement character
  2018-09-03 14:59           ` Corinna Vinschen
@ 2018-09-03 16:34             ` Thomas Wolff
  2018-09-03 17:17               ` Corinna Vinschen
  0 siblings, 1 reply; 62+ messages in thread
From: Thomas Wolff @ 2018-09-03 16:34 UTC (permalink / raw)
  To: cygwin

On 03.09.2018 at 16:59, Corinna Vinschen wrote:
> On Sep  3 14:46, Corinna Vinschen wrote:
>> On Sep  2 05:51, Steven Penny wrote:
>>> On Sun, 2 Sep 2018 10:07:10, Thomas Wolff wrote:
>>>> Actually, the width problem I suggested in my other response (and even
>>>> referring to the wrong character) does not apply as mintty enforces
>>>> proper width in that case.
>>>> Also, even with fonts that do not provide the glyph, you will usually
>>>> still see it by the Windows font fallback mechanism.
>>>> Shall I make it configurable?
>>> your call - here are the possible resolutions - in order of my preference:
>>>
>>> 1. Change the default to U+FFFD with no option
>>> 2. Change the default to U+FFFD with option to change
>>> 3. Leave default as is with option to change
>> Ideally we could check if the current font supports a visual
>> representation of 0xfffd and if not, fall back to 0x2592.
>>
>> Not sure how feasible that is, but it doesn't seem to be overly
>> complicated.  I'm just looking into a solution for the Cygwin
>> console.
> Only, I can't get this working.  In theory the GDI function
> GetGlyphIndicesW is supposed to allow checking if a certain character
> exists.  But I'm getting a weird result.  This code:
>
>    static const wchar_t replacement_char[2] =
>      {
>        0xfffd, /* REPLACEMENT CHARACTER */
>        0x2592  /* MEDIUM SHADE */
>      };
>    HWND cwnd = GetConsoleWindow ();
>    HDC cdc = GetDC (cwnd);
>    int rp_idx = 0;
>    WORD gi = 0;
>    DWORD ret = GetGlyphIndicesW (cdc, replacement_char, 1, &gi,
>                                  GGI_MARK_NONEXISTING_GLYPHS);
>    if (ret != GDI_ERROR && gi == 0xffff)
>      rp_idx = 1;
>
> always sets rp_idx to 1 when called from inside the Cygwin DLL,
> independently of the actual console font.  And, here's the really weird
> thing, it always sets rp_idx to 0 when called directly from an
> application, likewise independently of the actual console font.
>
> Does anybody have an idea what I'm doing wrong?
This works in mintty; I just uploaded a patch. Maybe somehow the
GetConsole "dc" does not support this usage?

* Re: Cygwin fails to utilize Unicode replacement character
  2018-09-03 16:34             ` Thomas Wolff
@ 2018-09-03 17:17               ` Corinna Vinschen
  2018-09-03 17:56                 ` Thomas Wolff
  0 siblings, 1 reply; 62+ messages in thread
From: Corinna Vinschen @ 2018-09-03 17:17 UTC (permalink / raw)
  To: cygwin

On Sep  3 18:34, Thomas Wolff wrote:
> On 03.09.2018 at 16:59, Corinna Vinschen wrote:
> > On Sep  3 14:46, Corinna Vinschen wrote:
> > > On Sep  2 05:51, Steven Penny wrote:
> > > > On Sun, 2 Sep 2018 10:07:10, Thomas Wolff wrote:
> > > > > Actually, the width problem I suggested in my other response (and even
> > > > > referring to the wrong character) does not apply as mintty enforces
> > > > > proper width in that case.
> > > > > Also, even with fonts that do not provide the glyph, you will usually
> > > > > still see it by the Windows font fallback mechanism.
> > > > > Shall I make it configurable?
> > > > your call - here are the possible resolutions - in order of my preference:
> > > > 
> > > > 1. Change the default to U+FFFD with no option
> > > > 2. Change the default to U+FFFD with option to change
> > > > 3. Leave default as is with option to change
> > > Ideally we could check if the current font supports a visual
> > > representation of 0xfffd and if not, fall back to 0x2592.
> > > 
> > > Not sure how feasible that is, but it doesn't seem to be overly
> > > complicated.  I'm just looking into a solution for the Cygwin
> > > console.
> > Only, I can't get this working.  In theory the GDI function
> > GetGlyphIndicesW is supposed to allow checking if a certain character
> > exists.  But I'm getting a weird result.  This code:
> > 
> >    static const wchar_t replacement_char[2] =
> >      {
> >        0xfffd, /* REPLACEMENT CHARACTER */
> >        0x2592  /* MEDIUM SHADE */
> >      };
> >    HWND cwnd = GetConsoleWindow ();
> >    HDC cdc = GetDC (cwnd);
> >    int rp_idx = 0;
> >    WORD gi = 0;
> >    DWORD ret = GetGlyphIndicesW (cdc, replacement_char, 1, &gi,
> >                                  GGI_MARK_NONEXISTING_GLYPHS);
> >    if (ret != GDI_ERROR && gi == 0xffff)
> >      rp_idx = 1;
> > 
> > always sets rp_idx to 1 when called from inside the Cygwin DLL,
> > independently of the actual console font.  And, here's the really weird
> > thing, it always sets rp_idx to 0 when called directly from an
> > application, likewise independently of the actual console font.
> > 
> > Does anybody have an idea what I'm doing wrong?
> This works in mintty, just uploaded a patch. Maybe somehow the GetConsole
> "dc" does not support this usage?

¯\_(ツ)_/¯

But I'm glad it works for you.


Corinna

-- 
Corinna Vinschen                  Please, send mails regarding Cygwin to
Cygwin Maintainer                 cygwin AT cygwin DOT com
Red Hat

* Re: Cygwin fails to utilize Unicode replacement character
  2018-09-03 17:17               ` Corinna Vinschen
@ 2018-09-03 17:56                 ` Thomas Wolff
  2018-09-03 18:20                   ` Thomas Wolff
  0 siblings, 1 reply; 62+ messages in thread
From: Thomas Wolff @ 2018-09-03 17:56 UTC (permalink / raw)
  To: cygwin

On 03.09.2018 at 19:16, Corinna Vinschen wrote:
> On Sep  3 18:34, Thomas Wolff wrote:
>> On 03.09.2018 at 16:59, Corinna Vinschen wrote:
>>> On Sep  3 14:46, Corinna Vinschen wrote:
>>>> On Sep  2 05:51, Steven Penny wrote:
>>>>> On Sun, 2 Sep 2018 10:07:10, Thomas Wolff wrote:
>>>>>> Actually, the width problem I suggested in my other response (and even
>>>>>> referring to the wrong character) does not apply as mintty enforces
>>>>>> proper width in that case.
>>>>>> Also, even with fonts that do not provide the glyph, you will usually
>>>>>> still see it by the Windows font fallback mechanism.
>>>>>> Shall I make it configurable?
>>>>> your call - here are the possible resolutions - in order of my preference:
>>>>>
>>>>> 1. Change the default to U+FFFD with no option
>>>>> 2. Change the default to U+FFFD with option to change
>>>>> 3. Leave default as is with option to change
>>>> Ideally we could check if the current font supports a visual
>>>> representation of 0xfffd and if not, fall back to 0x2592.
>>>>
>>>> Not sure how feasible that is, but it doesn't seem to be overly
>>>> complicated.  I'm just looking into a solution for the Cygwin
>>>> console.
>>> Only, I can't get this working.  In theory the GDI function
>>> GetGlyphIndicesW is supposed to allow checking if a certain character
>>> exists.  But I'm getting a weird result.  This code:
>>>
>>>     static const wchar_t replacement_char[2] =
>>>       {
>>>         0xfffd, /* REPLACEMENT CHARACTER */
>>>         0x2592  /* MEDIUM SHADE */
>>>       };
>>>     HWND cwnd = GetConsoleWindow ();
>>>     HDC cdc = GetDC (cwnd);
>>>     int rp_idx = 0;
>>>     WORD gi = 0;
>>>     DWORD ret = GetGlyphIndicesW (cdc, replacement_char, 1, &gi,
>>>                                   GGI_MARK_NONEXISTING_GLYPHS);
>>>     if (ret != GDI_ERROR && gi == 0xffff)
>>>       rp_idx = 1;
>>>
>>> always sets rp_idx to 1 when called from inside the Cygwin DLL,
>>> independently of the actual console font.  And, here's the really weird
>>> thing, it always sets rp_idx to 0 when called directly from an
>>> application, likewise independently of the actual console font.
>>>
>>> Does anybody have an idea what I'm doing wrong?
>> This works in mintty, just uploaded a patch. Maybe somehow the GetConsole
>> "dc" does not support this usage?
> ¯\_(ツ)_/¯
Ditto; hold on, sorry, your code does *not* work inside mintty.
Mine looks a bit different and I thought I had manually verified it's
functionally equivalent, but indeed there must be something fishy...

* Re: Cygwin fails to utilize Unicode replacement character
  2018-09-03 17:56                 ` Thomas Wolff
@ 2018-09-03 18:20                   ` Thomas Wolff
  2018-09-03 19:14                     ` Corinna Vinschen
  0 siblings, 1 reply; 62+ messages in thread
From: Thomas Wolff @ 2018-09-03 18:20 UTC (permalink / raw)
  To: cygwin

On 03.09.2018 at 19:56, Thomas Wolff wrote:
> On 03.09.2018 at 19:16, Corinna Vinschen wrote:
>> On Sep  3 18:34, Thomas Wolff wrote:
>>> On 03.09.2018 at 16:59, Corinna Vinschen wrote:
>>>> On Sep  3 14:46, Corinna Vinschen wrote:
>>>>> On Sep  2 05:51, Steven Penny wrote:
>>>>>> On Sun, 2 Sep 2018 10:07:10, Thomas Wolff wrote:
>>>>>>> Actually, the width problem I suggested in my other response (and even
>>>>>>> referring to the wrong character) does not apply as mintty enforces
>>>>>>> proper width in that case.
>>>>>>> Also, even with fonts that do not provide the glyph, you will usually
>>>>>>> still see it by the Windows font fallback mechanism.
>>>>>>> Shall I make it configurable?
>>>>>> your call - here are the possible resolutions - in order of my preference:
>>>>>>
>>>>>> 1. Change the default to U+FFFD with no option
>>>>>> 2. Change the default to U+FFFD with option to change
>>>>>> 3. Leave default as is with option to change
>>>>> Ideally we could check if the current font supports a visual
>>>>> representation of 0xfffd and if not, fall back to 0x2592.
>>>>>
>>>>> Not sure how feasible that is, but it doesn't seem to be overly
>>>>> complicated.  I'm just looking into a solution for the Cygwin
>>>>> console.
>>>> Only, I can't get this working.  In theory the GDI function
>>>> GetGlyphIndicesW is supposed to allow checking if a certain character
>>>> exists.  But I'm getting a weird result.  This code:
>>>>
>>>>     static const wchar_t replacement_char[2] =
>>>>       {
>>>>         0xfffd, /* REPLACEMENT CHARACTER */
>>>>         0x2592  /* MEDIUM SHADE */
>>>>       };
>>>>     HWND cwnd = GetConsoleWindow ();
>>>>     HDC cdc = GetDC (cwnd);
>>>>     int rp_idx = 0;
>>>>     WORD gi = 0;
>>>>     DWORD ret = GetGlyphIndicesW (cdc, replacement_char, 1, &gi,
>>>> GGI_MARK_NONEXISTING_GLYPHS);
>>>>     if (ret != GDI_ERROR && gi == 0xffff)
>>>>       rp_idx = 1;
>>>>
>>>> always sets rp_idx to 1 when called from inside the Cygwin DLL,
>>>> independently of the actual console font.  And, here's the really 
>>>> weird
>>>> thing, it always sets rp_idx to 0 when called directly from an
>>>> application, likewise independently of the actual console font.
>>>>
>>>> Does anybody have an idea what I'm doing wrong?
>>> This works in mintty, just uploaded a patch. Maybe somehow the 
>>> GetConsole
>>> "dc" does not support this usage?
>> ¯\_(ツ)_/¯
> Dito; hold on, sorry, your code does *not* work inside mintty.
> Mine looks a bit different and I thought to have manually verified 
> it's functionally equivalent, but indeed there must be something fishy...
You still need to select the font into the device context first:
   SelectObject(cdc, f);
where f is the HFONT of the font you want to check.
For comparison, see the function win_check_glyphs in the file
wintext.c in mintty.
Thomas

^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: Cygwin fails to utilize Unicode replacement character
  2018-09-03 18:20                   ` Thomas Wolff
@ 2018-09-03 19:14                     ` Corinna Vinschen
  2018-09-03 20:27                       ` Corinna Vinschen
  0 siblings, 1 reply; 62+ messages in thread
From: Corinna Vinschen @ 2018-09-03 19:14 UTC (permalink / raw)
  To: cygwin

[-- Attachment #1: Type: text/plain, Size: 4122 bytes --]

On Sep  3 20:20, Thomas Wolff wrote:
> On 03.09.2018 at 19:56, Thomas Wolff wrote:
> > On 03.09.2018 at 19:16, Corinna Vinschen wrote:
> > > On Sep  3 18:34, Thomas Wolff wrote:
> > > > On 03.09.2018 at 16:59, Corinna Vinschen wrote:
> > > > > On Sep  3 14:46, Corinna Vinschen wrote:
> > > > > > On Sep  2 05:51, Steven Penny wrote:
> > > > > > > On Sun, 2 Sep 2018 10:07:10, Thomas Wolff wrote:
> > > > > > > > Actually, the width problem I suggested in my
> > > > > > > > other response (and even
> > > > > > > > referring to the wrong character) does not apply as mintty enforces
> > > > > > > > proper width in that case.
> > > > > > > > Also, even with fonts that do not provide the
> > > > > > > > glyph, you will usually
> > > > > > > > still see it by the Windows font fallback mechanism.
> > > > > > > > Shall I make it configurable?
> > > > > > > your call - here are the possible resolutions - in
> > > > > > > order of my preference:
> > > > > > > 
> > > > > > > 1. Change the default to U+FFFD with no option
> > > > > > > 2. Change the default to U+FFFD with option to change
> > > > > > > 3. Leave default as is with option to change
> > > > > > Ideally we could check if the current font supports a visual
> > > > > > representation of 0xfffd and if not, fall back to 0x2592.
> > > > > > 
> > > > > > Not sure how feasible that is, but it doesn't seem to be overly
> > > > > > complicated.  I'm just looking into a solution for the Cygwin
> > > > > > console.
> > > > > Only, I can't get this working.  In theory the GDI function
> > > > > GetGlyphIndicesW is supposed to allow checking if a certain character
> > > > > exists.  But I'm getting a weird result.  This code:
> > > > > 
> > > > >     static const wchar_t replacement_char[2] =
> > > > >       {
> > > > >         0xfffd, /* REPLACEMENT CHARACTER */
> > > > >         0x2592  /* MEDIUM SHADE */
> > > > >       };
> > > > >     HWND cwnd = GetConsoleWindow ();
> > > > >     HDC cdc = GetDC (cwnd);
> > > > >     int rp_idx = 0;
> > > > >     WORD gi = 0;
> > > > >     DWORD ret = GetGlyphIndicesW (cdc, replacement_char, 1, &gi,
> > > > > GGI_MARK_NONEXISTING_GLYPHS);
> > > > >     if (ret != GDI_ERROR && gi == 0xffff)
> > > > >       rp_idx = 1;
> > > > > 
> > > > > always sets rp_idx to 1 when called from inside the Cygwin DLL,
> > > > > independently of the actual console font.  And, here's the
> > > > > really weird
> > > > > thing, it always sets rp_idx to 0 when called directly from an
> > > > > application, likewise independently of the actual console font.
> > > > > 
> > > > > Does anybody have an idea what I'm doing wrong?
> > > > This works in mintty, just uploaded a patch. Maybe somehow the
> > > > GetConsole
> > > > "dc" does not support this usage?
> > > ¯\_(ツ)_/¯
> > Dito; hold on, sorry, your code does *not* work inside mintty.
> > Mine looks a bit different and I thought to have manually verified it's
> > functionally equivalent, but indeed there must be something fishy...
> You still need to
>   SelectObject(cdc, f);
> where f is the HFONT of the font you want to check.
> To compare, you may check out function win_check_glyphs in file wintext.c in
> mintty.

Thanks, but I don't know how to get an HFONT for the current console font.

In the meantime I figured out why my GetCurrentConsoleFontEx call
failed with error 87 (ERROR_INVALID_PARAMETER):

When looking again I realized there's a member called cbSize.  The MSDN
docs neglect to mention that the cbSize member has to be primed with
sizeof(CONSOLE_FONT_INFOEX).  As soon as I tried that, the function
succeeded.
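
The size-priming pattern described above can be sketched in portable C.
This is only an illustration: struct mock_font_info, mock_get_font_info
and query_font_info are hypothetical names, not part of the Win32 API, and
the mock merely assumes that the real call rejects a structure whose
cbSize was left unset.

```c
#include <stddef.h>
#include <string.h>

/* Same numeric value as the Win32 error 87 mentioned above. */
#define MOCK_ERROR_INVALID_PARAMETER 87

/* Hypothetical stand-in for a size-prefixed struct such as
   CONSOLE_FONT_INFOEX; the field names are illustrative only. */
struct mock_font_info
{
  unsigned long cbSize;      /* caller must set this to sizeof (struct) */
  unsigned long font_index;
};

/* Hypothetical callee: refuses the request unless cbSize was primed. */
int
mock_get_font_info (struct mock_font_info *info)
{
  if (info == NULL || info->cbSize != sizeof *info)
    return MOCK_ERROR_INVALID_PARAMETER;
  info->font_index = 0;
  return 0;
}

/* Caller-side pattern: zero the struct, then prime cbSize. */
int
query_font_info (struct mock_font_info *info)
{
  memset (info, 0, sizeof *info);
  info->cbSize = sizeof *info;
  return mock_get_font_info (info);
}
```

Calling mock_get_font_info on a zeroed struct returns 87, while going
through query_font_info returns 0, which mirrors the behaviour seen with
the real function.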

Well, it's a start.  I now have the actual font name.  No idea how to
get an HFONT from there, though.  From what I can tell ATM, I'd have to
call CreateFont to get a new HFONT and then destroy it again after
usage.  This looks pretty wasteful.


Thanks,
Corinna

-- 
Corinna Vinschen                  Please, send mails regarding Cygwin to
Cygwin Maintainer                 cygwin AT cygwin DOT com
Red Hat

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: Cygwin fails to utilize Unicode replacement character
  2018-09-03 19:14                     ` Corinna Vinschen
@ 2018-09-03 20:27                       ` Corinna Vinschen
  2018-09-03 20:42                         ` Thomas Wolff
  0 siblings, 1 reply; 62+ messages in thread
From: Corinna Vinschen @ 2018-09-03 20:27 UTC (permalink / raw)
  To: cygwin

[-- Attachment #1: Type: text/plain, Size: 3547 bytes --]

On Sep  3 21:14, Corinna Vinschen wrote:
> On Sep  3 20:20, Thomas Wolff wrote:
> > On 03.09.2018 at 19:56, Thomas Wolff wrote:
> > > On 03.09.2018 at 19:16, Corinna Vinschen wrote:
> > > > On Sep  3 18:34, Thomas Wolff wrote:
> > > > > On 03.09.2018 at 16:59, Corinna Vinschen wrote:
> > > > > > Does anybody have an idea what I'm doing wrong?
> > > > > This works in mintty, just uploaded a patch. Maybe somehow the
> > > > > GetConsole
> > > > > "dc" does not support this usage?
> > > > ¯\_(ツ)_/¯
> > > Dito; hold on, sorry, your code does *not* work inside mintty.
> > > Mine looks a bit different and I thought to have manually verified it's
> > > functionally equivalent, but indeed there must be something fishy...
> > You still need to
> >   SelectObject(cdc, f);
> > where f is the HFONT of the font you want to check.
> > To compare, you may check out function win_check_glyphs in file wintext.c in
> > mintty.
> 
> Thanks but I don't know how to get a HFONT for the current console font.
> 
> In the meantime I figured out why my GetCurrentConsoleFontEx call
> failed with error 87:
> 
> When looking again I realized there's a member called cbSize.  The MSDN
> docs neglect to tell that the cbSize member has to be primed with
> sizeof(CONSOLE_FONT_INFOEX).  As soon as I tried that, the function
> succeeded.
> 
> Well, it's a start.  I now have the actual font name.  No idea how to
> get a HFONT from there, though.  From what I can tell ATM, I'd have to
> call CreateFont to get a new HFONT and then destroy it again after
> usage.  This looks pretty wasteful.

Well, it still doesn't work for me.  I now have the following code:

===================== SNIP ======================
#include <windows.h>
#include <stdio.h>
#include <string.h>
#include <wchar.h>

int
main (void)
{
  static const wchar_t replacement_char[2] =
    {
      0xfffd, /* REPLACEMENT CHARACTER */
      0x2592  /* MEDIUM SHADE */
    };

  CONSOLE_FONT_INFOEX cfi;
  HWND cwnd = GetConsoleWindow ();
  HDC cdc = GetDC (cwnd);
  int rp_idx = 1;  /* default to MEDIUM SHADE */
  WORD gi[2] = { 0, 0 };

  /* cbSize must be primed, otherwise the call fails with error 87. */
  memset (&cfi, 0, sizeof cfi);
  cfi.cbSize = sizeof cfi;
  if (GetCurrentConsoleFontEx (GetStdHandle (STD_OUTPUT_HANDLE), FALSE, &cfi))
    {
      printf ("font %ls\n", cfi.FaceName);
      HFONT hf = CreateFontW (cfi.dwFontSize.Y, cfi.dwFontSize.X,
                              0, 0, cfi.FontWeight, FALSE, FALSE, FALSE,
                              DEFAULT_CHARSET, OUT_DEFAULT_PRECIS,
                              CLIP_DEFAULT_PRECIS, DEFAULT_QUALITY,
                              FIXED_PITCH | FF_DONTCARE, cfi.FaceName);
      if (hf)
        {
          /* Select the console font into the DC before querying glyphs. */
          HFONT old_f = SelectObject (cdc, hf);
          if (GetGlyphIndicesW (cdc, replacement_char, 2, gi,
                                GGI_MARK_NONEXISTING_GLYPHS) != GDI_ERROR)
            {
              printf ("gi = %d %d\n", gi[0], gi[1]);
              /* 0xffff marks a glyph the font does not provide. */
              if (gi[0] != 0xffff)
                rp_idx = 0;
            }
          if (old_f)
            SelectObject (cdc, old_f);
          DeleteObject (hf);
        }
    }
  /* Give the window DC back once we are done with it. */
  if (cdc)
    ReleaseDC (cwnd, cdc);

  printf ("rp_idx = %d\n", rp_idx);
  return 0;
}
===================== SNAP ======================

Supposedly none of the fonts support 0xfffd:

  $ gcc -g -o cons cons.c -lgdi32
  $ ./cons
  font Consolas
  gi = 65535 879
  rp_idx = 1
  $ ./cons
  font Lucida Console
  gi = 65535 620
  rp_idx = 1
  $ ./cons
  font Courier New
  gi = 65535 372
  rp_idx = 1
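
Separated from the GDI calls, the decision the test program makes can be
sketched as a small portable helper.  The names pick_replacement and
GLYPH_MISSING are made up for this illustration; the only thing taken from
the code above is that a glyph index of 0xffff means the font lacks the
glyph.

```c
#include <stddef.h>

/* Value stored for glyphs the font does not provide when
   GGI_MARK_NONEXISTING_GLYPHS is passed. */
#define GLYPH_MISSING 0xffff

/* Return the index of the first candidate whose glyph exists.  The last
   candidate is the unconditional fallback, so it is never tested.
   n must be at least 1. */
size_t
pick_replacement (const unsigned short *gi, size_t n)
{
  for (size_t i = 0; i + 1 < n; i++)
    if (gi[i] != GLYPH_MISSING)
      return i;
  return n - 1;
}
```

With the Consolas result shown above (gi = 65535 879) this returns 1, i.e.
MEDIUM SHADE, matching the rp_idx the program printed.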

So I'm still doing something wrong, apparently.  Any hint?


Thanks,
Corinna

-- 
Corinna Vinschen                  Please, send mails regarding Cygwin to
Cygwin Maintainer                 cygwin AT cygwin DOT com
Red Hat

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: Cygwin fails to utilize Unicode replacement character
  2018-09-03 20:27                       ` Corinna Vinschen
@ 2018-09-03 20:42                         ` Thomas Wolff
  2018-09-03 21:03                           ` Corinna Vinschen
  0 siblings, 1 reply; 62+ messages in thread
From: Thomas Wolff @ 2018-09-03 20:42 UTC (permalink / raw)
  To: cygwin

On 03.09.2018 at 22:27, Corinna Vinschen wrote:
> On Sep  3 21:14, Corinna Vinschen wrote:
>> On Sep  3 20:20, Thomas Wolff wrote:
>>> On 03.09.2018 at 19:56, Thomas Wolff wrote:
>>>> On 03.09.2018 at 19:16, Corinna Vinschen wrote:
>>>>> On Sep  3 18:34, Thomas Wolff wrote:
>>>>>> On 03.09.2018 at 16:59, Corinna Vinschen wrote:
>>>>>>> Does anybody have an idea what I'm doing wrong?
>>>>>> This works in mintty, just uploaded a patch. Maybe somehow the
>>>>>> GetConsole
>>>>>> "dc" does not support this usage?
>>>>> ¯\_(ツ)_/¯
>>>> Dito; hold on, sorry, your code does *not* work inside mintty.
>>>> Mine looks a bit different and I thought to have manually verified it's
>>>> functionally equivalent, but indeed there must be something fishy...
>>> You still need to
>>>    SelectObject(cdc, f);
>>> where f is the HFONT of the font you want to check.
>>> To compare, you may check out function win_check_glyphs in file wintext.c in
>>> mintty.
>> Thanks but I don't know how to get a HFONT for the current console font.
>>
>> In the meantime I figured out why my GetCurrentConsoleFontEx call
>> failed with error 87:
>>
>> When looking again I realized there's a member called cbSize.  The MSDN
>> docs neglect to tell that the cbSize member has to be primed with
>> sizeof(CONSOLE_FONT_INFOEX).  As soon as I tried that, the function
>> succeeded.
>>
>> Well, it's a start.  I now have the actual font name.  No idea how to
>> get a HFONT from there, though.  From what I can tell ATM, I'd have to
>> call CreateFont to get a new HFONT and then destroy it again after
>> usage.  This looks pretty wasteful.
> Well, it still doesn't work for me.  I now have the following code:
>
> ===================== SNIP ======================
> #include <windows.h>
> #include <stdio.h>
> #include <wchar.h>
>
> int
> main ()
> {
>    static const wchar_t replacement_char[2] =
>      {
>        0xfffd, /* REPLACEMENT CHARACTER */
>        0x2592  /* MEDIUM SHADE */
>      };
>
>    CONSOLE_FONT_INFOEX cfi;
>    HWND cwnd = GetConsoleWindow ();
>    HDC cdc = GetDC (cwnd);
>    int rp_idx = 1;
>    WORD gi[2] = { 0, 0 };
>
>    memset (&cfi, 0, sizeof cfi);
>    cfi.cbSize = sizeof cfi;
>    if (GetCurrentConsoleFontEx (GetStdHandle (STD_OUTPUT_HANDLE), FALSE, &cfi))
>      {
>        printf ("font %ls\n", cfi.FaceName);
>        HFONT hf = CreateFontW (cfi.dwFontSize.Y, cfi.dwFontSize.X,
> 			      0, 0, cfi.FontWeight, FALSE, FALSE, FALSE,
> 			      DEFAULT_CHARSET, OUT_DEFAULT_PRECIS,
> 			      CLIP_DEFAULT_PRECIS, DEFAULT_QUALITY,
> 			      FIXED_PITCH | FF_DONTCARE, cfi.FaceName);
>        if (hf)
>        	{
> 	  HFONT old_f = SelectObject(cdc, hf);
> 	  if (GetGlyphIndicesW (cdc, replacement_char, 2, gi,
> 				GGI_MARK_NONEXISTING_GLYPHS) != GDI_ERROR)
> 	    {
> 	      printf ("gi = %d %d\n", gi[0], gi[1]);
> 	      if (gi[0] != 0xffff)
> 		rp_idx = 0;
> 	    }
> 	  if (old_f)
> 	    old_f = SelectObject (cdc, old_f);
> 	  DeleteObject (hf);
> 	}
>      }
>
>    printf ("rp_idx = %d\n", rp_idx);
>    return 0;
> }
> ===================== SNAP ======================
>
> Supposedly none of the fonts support 0xfffd:
>
>    $ gcc -g -o cons cons.c -lgdi32
>    $ ./cons
>    font Consolas
>    gi = 65535 879
>    rp_idx = 1
>    $ ./cons
>    font Lucida Console
>    gi = 65535 620
>    rp_idx = 1
>    $ ./cons
>    font Courier New
>    gi = 65535 372
>    rp_idx = 1
>
> So I'm still doing something wrong, apparently.  Any hint?
Test with a font that has the glyph; those 3 don't. Try DejaVu.

^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: Cygwin fails to utilize Unicode replacement character
  2018-09-03 20:42                         ` Thomas Wolff
@ 2018-09-03 21:03                           ` Corinna Vinschen
  2018-09-03 22:15                             ` Steven Penny
  0 siblings, 1 reply; 62+ messages in thread
From: Corinna Vinschen @ 2018-09-03 21:03 UTC (permalink / raw)
  To: cygwin

[-- Attachment #1: Type: text/plain, Size: 3519 bytes --]

On Sep  3 22:42, Thomas Wolff wrote:
> On 03.09.2018 at 22:27, Corinna Vinschen wrote:
> > On Sep  3 21:14, Corinna Vinschen wrote:
> > > On Sep  3 20:20, Thomas Wolff wrote:
> > > > On 03.09.2018 at 19:56, Thomas Wolff wrote:
> > > > > On 03.09.2018 at 19:16, Corinna Vinschen wrote:
> > > > > > On Sep  3 18:34, Thomas Wolff wrote:
> > > > > > > On 03.09.2018 at 16:59, Corinna Vinschen wrote:
> > > > > > > > Does anybody have an idea what I'm doing wrong?
> > > > > > > This works in mintty, just uploaded a patch. Maybe somehow the
> > > > > > > GetConsole
> > > > > > > "dc" does not support this usage?
> > > > > > ¯\_(ツ)_/¯
> > > > > Dito; hold on, sorry, your code does *not* work inside mintty.
> > > > > Mine looks a bit different and I thought to have manually verified it's
> > > > > functionally equivalent, but indeed there must be something fishy...
> > > > You still need to
> > > >    SelectObject(cdc, f);
> > > > where f is the HFONT of the font you want to check.
> > > > To compare, you may check out function win_check_glyphs in file wintext.c in
> > > > mintty.
> > > Thanks but I don't know how to get a HFONT for the current console font.
> > > 
> > > In the meantime I figured out why my GetCurrentConsoleFontEx call
> > > failed with error 87:
> > > 
> > > When looking again I realized there's a member called cbSize.  The MSDN
> > > docs neglect to tell that the cbSize member has to be primed with
> > > sizeof(CONSOLE_FONT_INFOEX).  As soon as I tried that, the function
> > > succeeded.
> > > 
> > > Well, it's a start.  I now have the actual font name.  No idea how to
> > > get a HFONT from there, though.  From what I can tell ATM, I'd have to
> > > call CreateFont to get a new HFONT and then destroy it again after
> > > usage.  This looks pretty wasteful.
> > Well, it still doesn't work for me.  I now have the following code:
> > 
> > ===================== SNIP ======================
> > [...]
> > ===================== SNAP ======================
> > 
> > Supposedly none of the fonts support 0xfffd:
> > 
> >    $ gcc -g -o cons cons.c -lgdi32
> >    $ ./cons
> >    font Consolas
> >    gi = 65535 879
> >    rp_idx = 1
> >    $ ./cons
> >    font Lucida Console
> >    gi = 65535 620
> >    rp_idx = 1
> >    $ ./cons
> >    font Courier New
> >    gi = 65535 372
> >    rp_idx = 1
> > 
> > So I'm still doing something wrong, apparently.  Any hint?
> Test with a font that has the glyph; those 3 don't. Try DejaVu.

I can't. I only have a limited set of fonts available in the console.

But yes, you're right.

What I just did was calling the GetFontUnicodeRanges function
for each font, and it turns out that none of the fonts support
0xfffd "REPLACEMENT CHARACTER", but all three support 0xfffc
"OBJECT REPLACEMENT CHARACTER".  I expanded the testcase to check
for this with GetGlyphIndicesW and, lo and behold, the result
makes sense.
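
The range check itself is simple once the ranges are available.  As a
rough sketch, assuming each range is a start code point plus a count (the
layout WCRANGE uses); struct code_range and ranges_contain are
hypothetical names and no real font data is included:

```c
#include <stddef.h>

/* Hypothetical mirror of the WCRANGE layout: a run of cGlyphs
   consecutive code points starting at wcLow. */
struct code_range
{
  unsigned short wcLow;
  unsigned short cGlyphs;
};

/* Return 1 if code point cp falls inside any of the n ranges, else 0. */
int
ranges_contain (const struct code_range *r, size_t n, unsigned short cp)
{
  for (size_t i = 0; i < n; i++)
    if (cp >= r[i].wcLow
        && (unsigned long) (cp - r[i].wcLow) < r[i].cGlyphs)
      return 1;
  return 0;
}
```

A range that starts at 0xfff9 with a count of 4 covers 0xfff9 through
0xfffc, so a lookup for 0xfffc succeeds while one for 0xfffd fails, which
is the kind of result described above.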

On the other hand, during testing I saw a 0xfffd character printed for
these fonts.  None of them actually supports 0xfffd, so apparently the
Windows console already uses replacement fonts if possible.

I guess I'll just stop here and always print 0xfffd.  I seriously doubt
it makes sense to add so much code just to print a single char in a
corner case.


Thanks a lot for your help,
Corinna

-- 
Corinna Vinschen                  Please, send mails regarding Cygwin to
Cygwin Maintainer                 cygwin AT cygwin DOT com
Red Hat

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: Cygwin fails to utilize Unicode replacement character
  2018-09-03 21:03                           ` Corinna Vinschen
@ 2018-09-03 22:15                             ` Steven Penny
  2018-09-04  6:06                               ` Brian Inglis
                                                 ` (2 more replies)
  0 siblings, 3 replies; 62+ messages in thread
From: Steven Penny @ 2018-09-03 22:15 UTC (permalink / raw)
  To: cygwin

On Mon, 3 Sep 2018 23:02:58, Corinna Vinschen wrote:
> I can't. I only have a limited set of fonts available in the console.

http://superuser.com/questions/390933/add-font-cmd-window-choices/956818

> What I just did was calling the GetFontUnicodeRanges function
> for each font, and it turns out that none of the fonts support
> 0xfffd "REPLACEMENT CHARACTER", but all three support 0xfffc
> "OBJECT REPLACEMENT CHARACTER".  I expanded the testcase to check
> for this with GetGlyphIndicesW and, lo and behold, the result
> makes sense.

Here is my code if it helps:

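    #include <stdio.h>
    #include <windows.h>
    int main(void)
    {
      /* cbSize must hold the structure size or the call fails. */
      CONSOLE_FONT_INFOEX ta = { 0 };
      ta.cbSize = sizeof ta;
      if (!GetCurrentConsoleFontEx(GetStdHandle(STD_OUTPUT_HANDLE), FALSE, &ta))
        return 1;
      HDC wh = GetDC(NULL);
      /* Default metrics; only the face name matters for the glyph lookup. */
      HFONT hf = CreateFontW(0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, ta.FaceName);
      HGDIOBJ old = SelectObject(wh, hf);
      WCHAR xr = 0xFFFD;
      WORD zu[1] = { 0xFFFF };
      GetGlyphIndicesW(wh, &xr, 1, zu, GGI_MARK_NONEXISTING_GLYPHS);
      printf("%ls: %s\n", ta.FaceName, *zu == 0xFFFF ? "FAILURE" : "SUCCESS");
      /* Restore the DC and free the GDI objects. */
      SelectObject(wh, old);
      DeleteObject(hf);
      ReleaseDC(NULL, wh);
      return 0;
    }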

Result:

    DejaVu Sans Mono: SUCCESS
    Consolas: FAILURE

> On the other hand, during testing I saw a 0xfffd character printed for
> these fonts.  None of them actually supports 0xfffd, so apparently the
> Windows console already uses replacement fonts if possible.
>
> I guess I just stop here and always print 0xfffd.  I seriously doubt
> it makes sense to add so much code just to print a single char in a
> border case.

This is not possible; most likely you were seeing the ".notdef glyph":

http://docs.microsoft.com/typography/opentype/spec/recom

for Consolas, which is similar in appearance to U+FFFD REPLACEMENT CHARACTER. The
difference is that if you copy the ".notdef glyph" and paste it into "Notepad" or
similar, it will paste the actual character that couldn't be shown in the console,
while pasting U+FFFD into "Notepad" will just paste U+FFFD itself.

Expanding on the "Notepad" example: the default "Notepad" font is "Lucida Console",
which doesn't have U+FFFD either. However, pasting into "Notepad" will still show
U+FFFD properly because "Tahoma" has U+FFFD and "Notepad" can use composite
fonts, while it appears "cmd.exe" and similar cannot.


^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: Cygwin fails to utilize Unicode replacement character
  2018-09-03 22:15                             ` Steven Penny
@ 2018-09-04  6:06                               ` Brian Inglis
  2018-09-04  9:00                               ` Corinna Vinschen
  2018-10-04  0:25                               ` Steven Penny
  2 siblings, 0 replies; 62+ messages in thread
From: Brian Inglis @ 2018-09-04  6:06 UTC (permalink / raw)
  To: cygwin

On 2018-09-03 16:15, Steven Penny wrote:
> On Mon, 3 Sep 2018 23:02:58, Corinna Vinschen wrote:
>> I can't. I only have a limited set of fonts available in the console.

Install the dejavu-fonts package or just the DejaVu Sans Mono font from:

	https://dejavu-fonts.github.io/Download.html
	http://sourceforge.net/projects/dejavu/files/dejavu/2.37/dejavu-fonts-ttf-2.37.tar.bz2

or see what glyph is at index 0 (.notdef)?

> http://superuser.com/questions/390933/add-font-cmd-window-choices/956818

For Windows support, from Explorer I just search for all *.[ot]tf under
...CygRoot.../usr/share/fonts/ and copy them into /Windows/Fonts/

>> What I just did was calling the GetFontUnicodeRanges function
>> for each font, and it turns out that none of the fonts support
>> 0xfffd "REPLACEMENT CHARACTER", but all three support 0xfffc
>> "OBJECT REPLACEMENT CHARACTER".  I expanded the testcase to check
>> for this with GetGlyphIndicesW and, lo and behold, the result
>> makes sense.

>> On the other hand, during testing I saw a 0xfffd character printed for
>> these fonts.  None of them actually supports 0xfffd, so apparently the
>> Windows console already uses replacement fonts if possible.
>> I guess I just stop here and always print 0xfffd.  I seriously doubt
>> it makes sense to add so much code just to print a single char in a
>> border case.

> this is not possible; most likely you were seeing the ".notdef glyph":
> http://docs.microsoft.com/typography/opentype/spec/recom
> for Consolas which is simlar in appearance to U+FFFD REPLACEMENT CHARACTER. The
> differnce is that if you copy the ".notdef glyph" and paste it into "Notepad" or
> similar, it will paste the proper character that couldnt be seen in the console,
> while pasting U+FFFD into "Notepad" will just paste itself.
> Expanding on the "Notepad" example, "Notepad" default font is "Lucida Console",
> which doesnt have U+FFFD either. However pasting into "Notepad" will still show
> U+FFFD properly because "Tahoma" has U+FFFD and "Notepad" can utilize composite
> font, while it appears "cmd.exe" and similar cannot.

You can use Windows font linking to use glyphs from linked fonts like:

. GNU Unifont showing bitmap glyphs for BMP code points - release 8.0.1 is
available in Cygwin package unifont-fonts - latest below is 11.0.2

	https://savannah.gnu.org/projects/unifont
	http://unifoundry.com/unifont/index.html

. Evertype Last Resort font provided by Apple showing standard representative
Unicode block glyphs with the code point in the wide glyph border

	https://www.unicode.org/policies/lastresortfont_eula.html

. SIL Fallback showing the BMP code point inside a box

	https://scripts.sil.org/cms/scripts/page.php?site_id=nrsi&id=UnicodeBMPFallbackFont

-- 
Take care. Thanks, Brian Inglis, Calgary, Alberta, Canada

This email may be disturbing to some readers as it contains
too much technical detail. Reader discretion is advised.

^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: Cygwin fails to utilize Unicode replacement character
  2018-09-03 22:15                             ` Steven Penny
  2018-09-04  6:06                               ` Brian Inglis
@ 2018-09-04  9:00                               ` Corinna Vinschen
  2018-09-04 11:40                                 ` Steven Penny
                                                   ` (2 more replies)
  2018-10-04  0:25                               ` Steven Penny
  2 siblings, 3 replies; 62+ messages in thread
From: Corinna Vinschen @ 2018-09-04  9:00 UTC (permalink / raw)
  To: cygwin

[-- Attachment #1: Type: text/plain, Size: 2679 bytes --]

On Sep  3 15:15, Steven Penny wrote:
> On Mon, 3 Sep 2018 23:02:58, Corinna Vinschen wrote:
> > I can't. I only have a limited set of fonts available in the console.
> 
> http://superuser.com/questions/390933/add-font-cmd-window-choices/956818
> 
> > What I just did was calling the GetFontUnicodeRanges function
> > for each font, and it turns out that none of the fonts support
> > 0xfffd "REPLACEMENT CHARACTER", but all three support 0xfffc
> > "OBJECT REPLACEMENT CHARACTER".  I expanded the testcase to check
> > for this with GetGlyphIndicesW and, lo and behold, the result
> > makes sense.
> 
> Here is my code if it helps:
> 
>    #include <stdio.h>
>    #include <windows.h>
>    int main()
>    {
>      CONSOLE_FONT_INFOEX ta;
>      ta.cbSize = sizeof ta;
>      GetCurrentConsoleFontEx(GetStdHandle(STD_OUTPUT_HANDLE), 0, &ta);
>      HDC wh = GetDC(0);
>      SelectObject(wh,
>        CreateFontW(0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, ta.FaceName));
>      WCHAR xr = 0xFFFD;
>      WORD zu[1];
>      GetGlyphIndicesW(wh, &xr, 1, zu, 1);
>      printf("%ls: %s\n", ta.FaceName, *zu == 0xFFFF ? "FAILURE" : "SUCCESS");
>    }

To what extent does that add information to the code I posted in
https://cygwin.com/ml/cygwin/2018-09/msg00056.html ?

> Result:
> 
>    DejaVu Sans Mono: SUCCESS

Wherever you get DejaVu Sans Mono from.  My W10 console only allows
selecting a handful of fonts: Consolas, Courier New, Lucida, MS Gothic,
NSimSun, Raster Fonts, SimSun-ExtB.

>    Consolas: FAILURE
> 
> > On the other hand, during testing I saw a 0xfffd character printed for
> > these fonts.  None of them actually supports 0xfffd, so apparently the
> > Windows console already uses replacement fonts if possible.
> > 
> > I guess I just stop here and always print 0xfffd.  I seriously doubt
> > it makes sense to add so much code just to print a single char in a
> > border case.
> 
> this is not possible; most likely you were seeing the ".notdef glyph":
> 
> http://docs.microsoft.com/typography/opentype/spec/recom

Yeah, that's it then.  Whatever.  The fact that none of the default
fonts available for the console provide 0xfffd REPLACEMENT CHARACTER
doesn't really contribute to my willingness to add lots of code for
a corner case.

We either keep 0xfffd now and the user gets the .notdef glyph, or I revert
the patch and let the console print 0x2592 MEDIUM SHADE again.

Decision has to be made today.  I will release 2.11.1 tomorrow.


Corinna

-- 
Corinna Vinschen                  Please, send mails regarding Cygwin to
Cygwin Maintainer                 cygwin AT cygwin DOT com
Red Hat

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: Cygwin fails to utilize Unicode replacement character
  2018-09-04  9:00                               ` Corinna Vinschen
@ 2018-09-04 11:40                                 ` Steven Penny
  2018-09-05  7:55                                   ` Corinna Vinschen
  2018-09-04 12:50                                 ` David Macek
  2018-09-04 13:05                                 ` Andrey Repin
  2 siblings, 1 reply; 62+ messages in thread
From: Steven Penny @ 2018-09-04 11:40 UTC (permalink / raw)
  To: cygwin

On Tue, 4 Sep 2018 11:00:00, Corinna Vinschen wrote:
> Whereever you get DejaVu Sans Mono from.

Cygwin provides it via the "dejavu-fonts" package, or you can get it here:

http://dejavu-fonts.github.io

> My W10 console only allows to specify a handful of fonts, Consolas, Courier
> New, Lucida, MS Gothic, NSimSun, Raster Fonts, SimSun-ExtB.

You can add DejaVu or others like this:

http://superuser.com/questions/390933/add-font-cmd-window-choices/956818

> Yeah, that's it then.  Whatever.  The fact that none of the default
> fonts available for the console provide 0xfffd REPLACEMENT CHARACTER
> doesn't really contribute to my willingness to add lots of code for
> a border case.
>
> We either keep 0xfffd now and the user gets the nodef glyph, or I revert
> the patch and let the console print 0x2592 MEDIUM SHADE again.
>
> Decision has to be made today.  I will release 2.11.1 tomorrow.

I'd prefer that you keep the patch that has already been committed. I was never a
fan of falling back to U+2592, but since we have the code for that now, it's your
call.

Cheers


^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: Cygwin fails to utilize Unicode replacement character
  2018-09-04  9:00                               ` Corinna Vinschen
  2018-09-04 11:40                                 ` Steven Penny
@ 2018-09-04 12:50                                 ` David Macek
  2018-09-04 14:18                                   ` Thomas Wolff
  2018-09-04 13:05                                 ` Andrey Repin
  2 siblings, 1 reply; 62+ messages in thread
From: David Macek @ 2018-09-04 12:50 UTC (permalink / raw)
  To: cygwin

[-- Attachment #1: Type: text/plain, Size: 511 bytes --]

On 4. 9. 2018 11:00, Corinna Vinschen wrote:
> We either keep 0xfffd now and the user gets the nodef glyph, or I revert
> the patch and let the console print 0x2592 MEDIUM SHADE again.
> 
> Decision has to be made today.  I will release 2.11.1 tomorrow.

I vote for keeping the patch and printing 0xFFFD.  It's okay in the default case,
it's exactly what was requested in the non-standard font case, and it's
future-proof in case ConHost implements rendering using fallback fonts.

-- 
David Macek


[-- Attachment #2: S/MIME Cryptographic Signature --]
[-- Type: application/pkcs7-signature, Size: 4002 bytes --]

^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: Cygwin fails to utilize Unicode replacement character
  2018-09-04  9:00                               ` Corinna Vinschen
  2018-09-04 11:40                                 ` Steven Penny
  2018-09-04 12:50                                 ` David Macek
@ 2018-09-04 13:05                                 ` Andrey Repin
  2 siblings, 0 replies; 62+ messages in thread
From: Andrey Repin @ 2018-09-04 13:05 UTC (permalink / raw)
  To: Corinna Vinschen, cygwin

Greetings, Corinna Vinschen!

>> Result:
>> 
>>    DejaVu Sans Mono: SUCCESS

> Whereever you get DejaVu Sans Mono from.  My W10 console only allows to
> specify a handful of fonts, Consolas, Courier New, Lucida, MS Gothic,
> NSimSun, Raster Fonts, SimSun-ExtB.

Something like

printf "DejaVu Sans Mono" > "/proc/registry/HKEY_LOCAL_MACHINE/SOFTWARE/Microsoft/Windows NT/CurrentVersion/Console/TrueTypeFont/00000"

should work. Make sure the number of "0"s is different from any existing
entries.


-- 
With best regards,
Andrey Repin
Tuesday, September 4, 2018 16:00:16

Sorry for my terrible english...



^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: Cygwin fails to utilize Unicode replacement character
  2018-09-04 12:50                                 ` David Macek
@ 2018-09-04 14:18                                   ` Thomas Wolff
  2018-09-04 14:46                                     ` David Macek
  2018-09-04 18:20                                     ` Steven Penny
  0 siblings, 2 replies; 62+ messages in thread
From: Thomas Wolff @ 2018-09-04 14:18 UTC (permalink / raw)
  To: cygwin


On 04.09.2018 14:49, David Macek wrote:
> On 4. 9. 2018 11:00, Corinna Vinschen wrote:
>> We either keep 0xfffd now and the user gets the nodef glyph, or I revert
>> the patch and let the console print 0x2592 MEDIUM SHADE again.
>>
>> Decision has to be made today.  I will release 2.11.1 tomorrow.
>
> I vote for keeping the patch and printing 0xFFFD.  It's okay in the 
> default case,
> it's exactly what was requested in the non-standard font case and it's 
> future
> proof in case ConHost implements rendering using fallback fonts.
>
My vote is against the patch because the .notdef glyph will often be just 
blank space, which is certainly worse than ▒.
If conhost does not provide a reasonable way to enquire about 0xFFFD 
availability, that is conhost's fault, not cygwin's, so why should cygwin 
implement a bad compromise? If conhost ever improves, cygwin can adapt.
Thomas


^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: Cygwin fails to utilize Unicode replacement character
  2018-09-04 14:18                                   ` Thomas Wolff
@ 2018-09-04 14:46                                     ` David Macek
  2018-09-04 18:20                                     ` Steven Penny
  1 sibling, 0 replies; 62+ messages in thread
From: David Macek @ 2018-09-04 14:46 UTC (permalink / raw)
  To: cygwin

[-- Attachment #1: Type: text/plain, Size: 315 bytes --]

On 4. 9. 2018 16:18, Thomas Wolff wrote:
> My vote is against the patch because the nodef glyph will often be just blank space which is certainly worse than ▒.

How often is "often"?  Do the default Windows fonts have okay nodef glyphs?

By the way, how does this work with OEM fonts?

-- 
David Macek


[-- Attachment #2: S/MIME Cryptographic Signature --]
[-- Type: application/pkcs7-signature, Size: 4002 bytes --]

^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: Cygwin fails to utilize Unicode replacement character
  2018-09-04 14:18                                   ` Thomas Wolff
  2018-09-04 14:46                                     ` David Macek
@ 2018-09-04 18:20                                     ` Steven Penny
  2018-09-04 18:41                                       ` Thomas Wolff
  2018-09-04 20:40                                       ` Brian Inglis
  1 sibling, 2 replies; 62+ messages in thread
From: Steven Penny @ 2018-09-04 18:20 UTC (permalink / raw)
  To: cygwin

On Tue, 4 Sep 2018 16:18:21, Thomas Wolff wrote:
> My vote is against the patch because the nodef glyph will often be just 
> blank space which is certainly worse than ▒.
> If conhost does not provide a reasonable way to enquire 0xFFFD 
> availability it's conhost's fault, not cygwin's so why should cygwin 
> implement a bad compromise. If conhost ever improves, cygwin can adapt.

This is some dangerous commentary. I would like to counter it now with some
actual research. Using BabelMap:

http://babelstone.co.uk/Software/BabelMap.html

You can do "Fonts", "Font Coverage" and you will get this result with code point
FFFD:

    yes: DejaVu Sans Mono

    no:
    - Consolas
    - Courier New
    - Lucida Console
    - MS Gothic
    - NSimSun
    - SimSun-ExtB

This is concerning, true, but we can then review the ".notdef glyph" for the
problem fonts. As this glyph is not an actual character, I can't paste it here,
but I will describe them below:


    empty rectangle:
    - Courier New
    - Lucida Console
    - MS Gothic
    - SimSun-ExtB

    rectangle with a question mark inside: Consolas

    none: NSimSun

Note that I did not include "Raster Fonts", as it doesn't even allow multibyte
characters:

    $ printf '\xC2\xA1\n'
    sh: printf: write error: Permission denied



^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: Cygwin fails to utilize Unicode replacement character
  2018-09-04 18:20                                     ` Steven Penny
@ 2018-09-04 18:41                                       ` Thomas Wolff
  2018-09-04 19:50                                         ` Andrey Repin
  2018-09-04 19:53                                         ` Steven Penny
  2018-09-04 20:40                                       ` Brian Inglis
  1 sibling, 2 replies; 62+ messages in thread
From: Thomas Wolff @ 2018-09-04 18:41 UTC (permalink / raw)
  To: cygwin

Am 04.09.2018 um 20:20 schrieb Steven Penny:
> On Tue, 4 Sep 2018 16:18:21, Thomas Wolff wrote:
>> My vote is against the patch because the nodef glyph will often be 
>> just blank space which is certainly worse than ▒.
>> If conhost does not provide a reasonable way to enquire 0xFFFD 
>> availability it's conhost's fault, not cygwin's so why should cygwin 
>> implement a bad compromise. If conhost ever improves, cygwin can adapt.
>
> This is some dangerous commentary. I would like to counter it now with 
> some actual research.
No idea what you consider dangerous. Anyway, we obviously agree that 
hardly any available console font supports the REPLACEMENT CHARACTER. 
You had previously suggested code that might work (using CreateFont(0, 
0, ....)). Maybe you can sort out with Corinna how to get that to work 
inside cygwin. Otherwise, my opinion:
- *working* fallback from FFFD to 2592: good
- revert to 2592: OK
- fix FFFD: not good, because the .notdef glyph is not an appropriate 
indication of illegal encoding (like broken UTF-8 bytes)
Thomas


^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: Cygwin fails to utilize Unicode replacement character
  2018-09-04 18:41                                       ` Thomas Wolff
@ 2018-09-04 19:50                                         ` Andrey Repin
  2018-09-04 19:53                                         ` Steven Penny
  1 sibling, 0 replies; 62+ messages in thread
From: Andrey Repin @ 2018-09-04 19:50 UTC (permalink / raw)
  To: Thomas Wolff, cygwin

Greetings, Thomas Wolff!

>>> My vote is against the patch because the nodef glyph will often be
>>> just blank space which is certainly worse than ▒.
>>> If conhost does not provide a reasonable way to enquire 0xFFFD 
>>> availability it's conhost's fault, not cygwin's so why should cygwin 
>>> implement a bad compromise. If conhost ever improves, cygwin can adapt.
>>
>> This is some dangerous commentary. I would like to counter it now with 
>> some actual research.
> No idea what you consider dangerous. Anyway, we obviously agree that 
> hardly any available console font supports the REPLACEMENT CHARACTER.

If by "console" you mean "raster", then the terminal is simply unable to render
U+FFFD in raster font mode.

I.e.

$ php -r 'print "\u{FFFD}\n";' | cat -
cat: write error: Permission denied

This is regardless of selected codepage+locale.

> You had previously suggested code that might work (using CreateFont(0, 
> 0, ....)). Maybe you can sort out with Corinna how to get that work 
> inside cygwin. Otherwise, my opinion:
> - *working* fallback from FFFD to 2592: good

That does not work either.

$ php -r 'print "\u{2592}\n";' | cat -
cat: write error: Permission denied

> - revert to 2592: OK
> - fix FFFD: not good, because the .notdef glyph is not an appropriate 
> indication of illegal encoding (like broken UTF-8 bytes)

For both Consolas and Lucida Console, U+FFFD displays a sensible presentation in
the terminal.
It may be less sensible for Lucida Console, but it is still immediately
recognizable to anybody who has seen unknown-character glyphs before.
And if Microsoft improves things, it will only get better with no additional effort.

Whereas U+2592
1. is unrecognizable.
2. may actually appear in legitimate output.


-- 
With best regards,
Andrey Repin
Tuesday, September 4, 2018 22:10:29

Sorry for my terrible english...


^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: Cygwin fails to utilize Unicode replacement character
  2018-09-04 18:41                                       ` Thomas Wolff
  2018-09-04 19:50                                         ` Andrey Repin
@ 2018-09-04 19:53                                         ` Steven Penny
  2018-09-04 21:43                                           ` Thomas Wolff
  1 sibling, 1 reply; 62+ messages in thread
From: Steven Penny @ 2018-09-04 19:53 UTC (permalink / raw)
  To: cygwin

On Tue, 4 Sep 2018 20:41:48, Thomas Wolff wrote:
> No idea what you consider dangerous. Anyway, we obviously agree that 
> hardly any available console font supports the REPLACEMENT CHARACTER. 
> You had previously suggested code that might work (using CreateFont(0, 
> 0, ....)). Maybe you can sort out with Corinna how to get that work 
> inside cygwin. Otherwise, my opinion:
> - *working* fallback from FFFD to 2592: good

I am fine with this, but I think Corinna feels it is too much code for not
enough benefit - that's her decision.

> - fix FFFD: not good, because the .notdef glyph is not an appropriate 
> indication of illegal encoding (like broken UTF-8 bytes)

I'm not sure what you even mean by this - FFFD doesn't need fixing - Windows just
needs to adopt some fonts with proper Unicode support. We are dealing with their
failure to do that.

> the .notdef glyph is not an appropriate indication of illegal encoding (like
> broken UTF-8 bytes)

True, but neither is U+2592. As far as I know, U+2592 is not defined officially
anywhere as being a representation of anything other than "MEDIUM SHADE".
Corinna originally added it in 2009:

http://cygwin.com/git/gitweb.cgi?p=newlib-cygwin.git&a=commitdiff&h=161211d

with no justification for why it was chosen that I can tell. Similarly, mintty
actually changed from U+FFFD to U+2592 in 2009:

http://github.com/mintty/mintty/commit/90c11d3

with actually a good reason, which was to avoid ambiguity with fonts that didn't
have U+FFFD. But again, no reason why U+2592 was chosen. I personally see both
sides of the argument, but I tend to land on the side of any standards if they
exist. Here is the standard for U+FFFD:

http://unicode.org/charts/nameslist/n_FFF0.html

> - revert to 2592: OK

if we were to use something other than U+FFFD, I would propose U+25A1, as it is
also defined by Unicode:

    25A1	 □ 	White Square
    •	may be used to represent a missing ideograph

http://unicode.org/charts/nameslist/n_25A0.html

and it has better support than U+FFFD:

    yes:
    - Consolas
    - Courier New
    - DejaVu Sans Mono
    - MS Gothic
    - NSimSun

    no:
    - Lucida Console
    - SimSun-ExtB



^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: Cygwin fails to utilize Unicode replacement character
  2018-09-01 16:13 Cygwin fails to utilize Unicode replacement character Steven Penny
                   ` (2 preceding siblings ...)
  2018-09-01 21:50 ` Doug Henderson
@ 2018-09-04 19:59 ` Doug Henderson
  2018-09-04 21:05   ` Steven Penny
  3 siblings, 1 reply; 62+ messages in thread
From: Doug Henderson @ 2018-09-04 19:59 UTC (permalink / raw)
  To: cygwin

On Sat, 1 Sep 2018 at 10:13, Steven Penny  wrote:
<snip>
> You get this result with Linux:
>
>     $ cat alfa.txt
>     �
>
> Where "cat" properly outputs Unicode 'REPLACEMENT CHARACTER' (U+FFFD). However
> with Cygwin you get this:
>
>     $ cat alfa.txt
>     ▒
>
> Where "cat" outputs Unicode Character 'MEDIUM SHADE' (U+2592).


My preference is to remove the output fiddling code that Corinna has
been working on. It is trying to solve the wrong problem.
I think we have gone down a rabbit hole at the wrong end of cat's data flow.

Should any changes to the way a character is displayed be required, it
needs to be in the terminal program that displays the character, not in
cygwin which should pass the character along unmodified.

Both cygwin and Debian 9.5 show:

    $ file alfa.txt
    alfa.txt: ISO-8859 text

When Linux reads the file, it assumes the encoding is UTF-8.
When cygwin reads the file, it assumes the encoding is CP1252.
This command shows the problem:

    $ iconv -f utf8 alfa.txt
    iconv: alfa.txt:1:0: incomplete character or shift sequence

On Linux, this shows a slightly different message, with the same intent.

Try using this string:

    $ printf "\xC3\xAB\353\n"
    ë▒

to get a better understanding of the problem. It contains two
representations of LATIN SMALL LETTER E WITH DIAERESIS, first encoded
in UTF-8, then using ISO-8859-1.
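
The decoding of that string can be sketched in C. This is only an
illustration under stated assumptions, not the code that cygwin or any
terminal actually uses: decode_utf8 is a hypothetical helper that reads one
code point and substitutes U+FFFD REPLACEMENT CHARACTER for each byte that
does not start a well-formed UTF-8 sequence. Consuming exactly one byte per
error is a simplification chosen for brevity.

```c
#include <stddef.h>
#include <stdint.h>

#define REPLACEMENT_CHAR 0xFFFDu

/* Hypothetical helper: decode one code point from s[0..len).
 * Stores the result in *cp and returns the number of bytes consumed
 * (0 only when len is 0). Ill-formed input yields U+FFFD and
 * consumes a single byte. */
size_t decode_utf8(const unsigned char *s, size_t len, uint32_t *cp)
{
    size_t need;          /* number of continuation bytes expected */
    uint32_t value, min;  /* decoded value and smallest legal value */

    if (len == 0)
        return 0;

    if (s[0] < 0x80) {                      /* plain ASCII */
        *cp = s[0];
        return 1;
    } else if ((s[0] & 0xE0) == 0xC0) {     /* 2-byte lead */
        need = 1; value = s[0] & 0x1F; min = 0x80;
    } else if ((s[0] & 0xF0) == 0xE0) {     /* 3-byte lead */
        need = 2; value = s[0] & 0x0F; min = 0x800;
    } else if ((s[0] & 0xF8) == 0xF0) {     /* 4-byte lead */
        need = 3; value = s[0] & 0x07; min = 0x10000;
    } else {                                /* stray continuation or bad lead */
        *cp = REPLACEMENT_CHAR;
        return 1;
    }

    if (len < need + 1) {                   /* sequence cut short */
        *cp = REPLACEMENT_CHAR;
        return 1;
    }
    for (size_t i = 1; i <= need; i++) {
        if ((s[i] & 0xC0) != 0x80) {        /* not a continuation byte */
            *cp = REPLACEMENT_CHAR;
            return 1;
        }
        value = (value << 6) | (s[i] & 0x3F);
    }
    /* Reject overlong forms, surrogates and values beyond U+10FFFF. */
    if (value < min || value > 0x10FFFF ||
        (value >= 0xD800 && value <= 0xDFFF)) {
        *cp = REPLACEMENT_CHAR;
        return 1;
    }
    *cp = value;
    return need + 1;
}
```

Fed the four bytes C3 AB EB 0A produced by the printf above, successive
calls yield U+00EB, then U+FFFD for the lone EB byte, then U+000A.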

There are two different reasons for the MEDIUM SHADE. Here it
indicates an invalid UTF-8 character, and the font does not have a
glyph for REPLACEMENT CHARACTER. The MEDIUM SHADE is also used in
place of an ordinary character without a glyph in the font.

HTH
Doug

-- 
Doug Henderson, Calgary, Alberta, Canada - from gmail.com


^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: Cygwin fails to utilize Unicode replacement character
  2018-09-04 18:20                                     ` Steven Penny
  2018-09-04 18:41                                       ` Thomas Wolff
@ 2018-09-04 20:40                                       ` Brian Inglis
  2018-09-05  8:32                                         ` Corinna Vinschen
  1 sibling, 1 reply; 62+ messages in thread
From: Brian Inglis @ 2018-09-04 20:40 UTC (permalink / raw)
  To: cygwin

On 2018-09-04 12:20, Steven Penny wrote:
> On Tue, 4 Sep 2018 16:18:21, Thomas Wolff wrote:
>> My vote is against the patch because the nodef glyph will often be just blank
>> space which is certainly worse than ▒.

Not according to the sample below: you would have to know that medium shade
means unavailable.

>> If conhost does not provide a reasonable way to enquire 0xFFFD availability
>> it's conhost's fault, not cygwin's so why should cygwin implement a bad
>> compromise. If conhost ever improves, cygwin can adapt.
> This is some dangerous commentary. I would like to counter it now with some
> actual research. Using BabelMap:
> http://babelstone.co.uk/Software/BabelMap.html
> You can do "Fonts", "Font Coverage" and you will get this result with code point
> FFFD:
>    yes: DejaVu Sans Mono
>    no:
>    - Consolas
>    - Courier New
>    - Lucida Console
>    - MS Gothic
>    - NSimSun
>    - SimSun-ExtB
> This is concerning true, but we can then review the ".notdef glyph" for the
> problem fonts. As this glyph is not an actual character, i cant paste it here,
> but i will describe them below:
>    empty rectangle:
>    - Courier New
>    - Lucida Console
>    - MS Gothic
>    - SimSun-ExtB
>    rectangle with a question mark inside: Consolas

These are both recommended .notdef glyphs.

>    none: NSimSun

Valid OTF and TTF fonts must have a glyph with index entry 0 used for .notdef.

-- 
Take care. Thanks, Brian Inglis, Calgary, Alberta, Canada

This email may be disturbing to some readers as it contains
too much technical detail. Reader discretion is advised.


^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: Cygwin fails to utilize Unicode replacement character
  2018-09-04 19:59 ` Doug Henderson
@ 2018-09-04 21:05   ` Steven Penny
  0 siblings, 0 replies; 62+ messages in thread
From: Steven Penny @ 2018-09-04 21:05 UTC (permalink / raw)
  To: cygwin

On Tue, 4 Sep 2018 13:59:10, Doug Henderson wrote:
> My preference is to remove the output fiddling code that Corrina has
> been working on. It is trying to solve the wrong problem.
> I think we have gone down a rabbit hole at the wrong end of cat's data flow.

this has nothing to do with "cat". it has to do with the unfounded design
decision to use U+2592. Granted at this point we are bikeshedding - but an
official standard does exist, namely Unicode, with 2 applicable characters for
this use case:

1. U+FFFD: http://unicode.org/charts/nameslist/n_FFF0.html
2. U+25A1: http://unicode.org/charts/nameslist/n_25A0.html

> Should any changes to the way a character is displayed be required, it
> needs to be in the terminal program that display the character, not in
> cygwin which should pass the character along unmodified.

the "terminal" in this case is either "cygwin" or "xterm" - in both cases code
changes have already been made in response to this thread, so I don't think your
comment here holds weight.

> Both cygwin and Debian 9.5 show:
>
>     $ file alfa.txt
>     alfa.txt: ISO-8859 text
>
> When Linux reads the file, it assumes the encoding is UTF-8.
> When cygwin reads the file, it assume the encoding is CP1252
> This command shows the problem
>
>     $ iconv -f utf8 alfa.txt
>     iconv: alfa.txt:1:0: incomplete character or shift sequence
>
> On Linux, this shows a slightly different message, with the same intent.
>
> Try using this string:
>
>     $ printf "\xC3\xAB\353\n"
>     ë▒
>
> to get a better understanding of the problem. It contains two
> representation of LATIN SMALL LETTER E WITH DIAERESIS, first encoded
> in UTF-8, then using ISO-8859-1.

now it appears *you* are going down the rabbit hole. both Cygwin and Mintty were
in violation of the Unicode standard - however this has already been remedied in the
code.

> There are two different reasons for the MEDIUM SHADE. Here it
> indicates an invalid UTF-8 character, and the font does not have a
> glyph for REPLACEMENT CHARACTER. The MEDIUM SHADE is also used in
> place of an ordinary character without a glyph in the font.

this is flat wrong. U+2592 MEDIUM SHADE is *only* used in cases of invalid
UTF-8. In case of missing character - the ".notdef" glyph is used - as has been
discussed several times in this thread. This is not an actual character, so i
cannot paste it here - but as an example with "DejaVu Sans Mono" the glyph is
an empty rectangle.



^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: Cygwin fails to utilize Unicode replacement character
  2018-09-04 19:53                                         ` Steven Penny
@ 2018-09-04 21:43                                           ` Thomas Wolff
  2018-09-04 23:29                                             ` Steven Penny
  0 siblings, 1 reply; 62+ messages in thread
From: Thomas Wolff @ 2018-09-04 21:43 UTC (permalink / raw)
  To: cygwin

Am 04.09.2018 um 21:53 schrieb Steven Penny:
> On Tue, 4 Sep 2018 20:41:48, Thomas Wolff wrote:
> ...
>> the .notdef glyph is not an appropriate indication of illegal 
>> encoding (like broken UTF-8 bytes)
>
> true, but neither is U+2592. as far as i know U+2592 is not defined 
> officially
> anywhere as being a representation of anything other than "MEDIUM SHADE".
Traditionally, many terminals used to display the DEL character as a 
checkered block, which is more or less the MEDIUM SHADE.
This makes the glyph appear somewhat "erroneous" by convention.

> Corinna originally added it in 2009:
>
> http://cygwin.com/git/gitweb.cgi?p=newlib-cygwin.git&a=commitdiff&h=161211d 
>
>
> with no justification of why it was chosen that i can tell.
Justification is traditional usage of the symbol as described above.

> similarly, mintty
> actually changed from U+FFFD to U+2592 in 2009:
>
> http://github.com/mintty/mintty/commit/90c11d3
>
> with actually a good reason, which was to avoid ambiguity with fonts 
> that didnt
> have U+FFFD. but again, no reason why U+2592 was chosen. i personally 
> see both
> sides of the argument but i tend to land of the side of any standards 
> if they
> exist.

> Here is the standard for U+FFFD:
>
> http://unicode.org/charts/nameslist/n_FFF0.html
FFFD     �     Replacement Character
           •    used to replace an incoming character whose value is 
unknown or unrepresentable in Unicode
>
> if we were to use something other than U+FFFD, I would propose U+25A1, 
> as it is
> also defined by Unicode:
>
>    25A1     □     White Square
>    •    may be used to represent a missing ideograph
>
> http://unicode.org/charts/nameslist/n_25A0.html
Quoting yourself from your other response:
> U+2592 MEDIUM SHADE is *only* used in cases of invalid UTF-8. In case 
> of missing character - the ".notdef" glyph is used
This is my point. We have two use cases here:
invalid code point -> MEDIUM SHADE
valid code point with no glyph in font -> .notdef glyph -> WHITE SQUARE
Now if you switch to FFFD REPLACEMENT CHARACTER for invalid code point, 
and considering that it does not exist in most actual fonts and that the 
console does not apply font fallback, it will resolve to WHITE SQUARE, thus:
folding the two different use cases into the same appearance,
which is bad.
Thomas


^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: Cygwin fails to utilize Unicode replacement character
  2018-09-04 21:43                                           ` Thomas Wolff
@ 2018-09-04 23:29                                             ` Steven Penny
  0 siblings, 0 replies; 62+ messages in thread
From: Steven Penny @ 2018-09-04 23:29 UTC (permalink / raw)
  To: cygwin

On Tue, 4 Sep 2018 23:43:16, Thomas Wolff wrote:
> Traditionally, many terminals used to display the DEL character as a 
> checkered block, which is more or less the MEDIUM SHADE.
> This makes the glyph appear somewhat "erroneous" by convention.

I see - now that Unicode has some dedicated characters for this, it would make
sense to use them, especially since linux is already using them:

1. U+FFFD: http://unicode.org/charts/nameslist/n_FFF0.html
2. U+25A1: http://unicode.org/charts/nameslist/n_25A0.html

> valid code point with no glyph in font -> .notdef glyph -> WHITE SQUARE

this is not true. "WHITE SQUARE" refers to U+25A1, which is an actual character
and different from the ".notdef" glyph. as has been discussed at length in this
thread, the ".notdef glyph" is not an actual character, but a glyph that exists
at position 0 in the font, and while its appearance is not strictly defined,
some recommendations exist:

- empty rectangle
- rectangle with a question mark
- rectangle with an X

> Now if you switch to FFFD REPLACEMENT CHARACTER for invalid code point, 
> and considering that it does not exist in most actual fonts and that the 
> console does not apply font fallback, it will resolve to WHITE SQUARE, thus:
> folding the two different use cases into the same appearance,
> which is bad.

no again, it will resolve to ".notdef glyph", as I put above. otherwise yes, you
do have a point. in the case of a font without U+FFFD, you have ultimately:

invalid code point: .notdef glyph
missing character: .notdef glyph

several ideas have been proposed:

1. keep U+FFFD
2. go back to U+2592
3. use U+25A1 instead
4. use U+FFFD if possible else fallback to U+2592 or U+25A1

if we choose option 1, people not happy with the ambiguity can simply install
"dejavu-fonts" or similar, which Cygwin provides.
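
Option 4 could look roughly like the following C sketch. It assumes the
caller already has one glyph index per candidate character, with 0xFFFF
marking a missing glyph (the marker that the Win32 function GetGlyphIndicesW
writes when called with GGI_MARK_NONEXISTING_GLYPHS). The names
pick_replacement and GLYPH_MISSING are hypothetical, and how to obtain
reliable glyph indices for the console font is left open here.

```c
#include <stddef.h>
#include <stdint.h>

/* Assumed marker for "font has no glyph for this code point". */
#define GLYPH_MISSING 0xFFFFu

/* Hypothetical helper: return the first candidate character whose glyph
 * index is not GLYPH_MISSING. cands[] lists code points in order of
 * preference and glyph_index[] holds the matching glyph indices.
 * If no candidate has a glyph, fall back to the most preferred one, which
 * the font then renders as its .notdef glyph. */
uint16_t pick_replacement(const uint16_t *cands,
                          const uint16_t *glyph_index,
                          size_t n)
{
    for (size_t i = 0; i < n; i++) {
        if (glyph_index[i] != GLYPH_MISSING)
            return cands[i];
    }
    return n > 0 ? cands[0] : 0xFFFDu;
}
```

With the candidates ordered U+FFFD, U+25A1, U+2592, a font that lacks only
U+FFFD would get U+25A1, and a font that lacks all three would still be sent
U+FFFD and show its .notdef glyph.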



^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: Cygwin fails to utilize Unicode replacement character
  2018-09-04 11:40                                 ` Steven Penny
@ 2018-09-05  7:55                                   ` Corinna Vinschen
  2018-09-05  9:22                                     ` Thomas Wolff
  2018-09-05 11:58                                     ` Steven Penny
  0 siblings, 2 replies; 62+ messages in thread
From: Corinna Vinschen @ 2018-09-05  7:55 UTC (permalink / raw)
  To: cygwin

[-- Attachment #1: Type: text/plain, Size: 1358 bytes --]

On Sep  4 04:40, Steven Penny wrote:
> On Tue, 4 Sep 2018 11:00:00, Corinna Vinschen wrote:
> > Whereever you get DejaVu Sans Mono from.
> 
> Cygwin provides it via the "dejavu-fonts" package, or you can get it here:
> 
> http://dejavu-fonts.github.io
> 
> > My W10 console only allows to specify a handful of fonts, Consolas, Courier
> > New, Lucida, MS Gothic, NSimSun, Raster Fonts, SimSun-ExtB.
> 
> You can add DejaVu or others like this:
> 
> http://superuser.com/questions/390933/add-font-cmd-window-choices/956818

I added DejaVu Sans Mono per the above and to my surprise I see this:

  $ cat alfa.txt
  �

So it looks like DejaVu has a 0xfffd char.  However, GetGlyphIndicesW
claims otherwise:

  static const wchar_t replacement_char[3] =
    {
      0xfffd, /* REPLACEMENT CHARACTER */
      0x25a1, /* WHITE SQUARE */
      0x2592  /* MEDIUM SHADE */
    };
  WORD gi[3] = { 0, 0, 0 };
  [...]
  GetGlyphIndicesW (cdc, replacement_char, 3, gi, GGI_MARK_NONEXISTING_GLYPHS);
  printf ("gi = %u %u %u\n", gi[0], gi[1], gi[2]);

This prints:

  gi = 65535 401 372

That means, the notdef glyph for DejaVu looks like 0xfffd, but isn't,
right?


Corinna

-- 
Corinna Vinschen                  Please, send mails regarding Cygwin to
Cygwin Maintainer                 cygwin AT cygwin DOT com
Red Hat

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: Cygwin fails to utilize Unicode replacement character
  2018-09-04 20:40                                       ` Brian Inglis
@ 2018-09-05  8:32                                         ` Corinna Vinschen
  0 siblings, 0 replies; 62+ messages in thread
From: Corinna Vinschen @ 2018-09-05  8:32 UTC (permalink / raw)
  To: cygwin

[-- Attachment #1: Type: text/plain, Size: 1942 bytes --]

On Sep  4 14:40, Brian Inglis wrote:
> On 2018-09-04 12:20, Steven Penny wrote:
> > On Tue, 4 Sep 2018 16:18:21, Thomas Wolff wrote:
> >> My vote is against the patch because the nodef glyph will often be just blank
> >> space which is certainly worse than ▒.
> 
> Not according to the sample below: you would have to know that medium shade
> means unavailable.
> 
> >> If conhost does not provide a reasonable way to enquire 0xFFFD availability
> >> it's conhost's fault, not cygwin's so why should cygwin implement a bad
> >> compromise. If conhost ever improves, cygwin can adapt.
> > This is some dangerous commentary. I would like to counter it now with some
> > actual research. Using BabelMap:
> > http://babelstone.co.uk/Software/BabelMap.html
> > You can do "Fonts", "Font Coverage" and you will get this result with code point
> > FFFD:
> >    yes: DejaVu Sans Mono
> >    no:
> >    - Consolas
> >    - Courier New
> >    - Lucida Console
> >    - MS Gothic
> >    - NSimSun
> >    - SimSun-ExtB
> > This is concerning true, but we can then review the ".notdef glyph" for the
> > problem fonts. As this glyph is not an actual character, i cant paste it here,
> > but i will describe them below:
> >    empty rectangle:
> >    - Courier New
> >    - Lucida Console
> >    - MS Gothic
> >    - SimSun-ExtB
> >    rectangle with a question mark inside: Consolas
> 
> These are both recommended .notdef glyphs.
> 
> >    none: NSimSun
> 
> Valid OTF and TTF fonts must have a glyph with index entry 0 used for .notdef.

Discussion closed for 2.11.1.  I'm going to release it as is, with
0xfffd as replacement char.

A better/more complex solution will have to go into the next release.


Thanks,
Corinna

-- 
Corinna Vinschen                  Please, send mails regarding Cygwin to
Cygwin Maintainer                 cygwin AT cygwin DOT com
Red Hat

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: Cygwin fails to utilize Unicode replacement character
  2018-09-05  7:55                                   ` Corinna Vinschen
@ 2018-09-05  9:22                                     ` Thomas Wolff
  2018-09-05 11:58                                     ` Steven Penny
  1 sibling, 0 replies; 62+ messages in thread
From: Thomas Wolff @ 2018-09-05  9:22 UTC (permalink / raw)
  To: cygwin

Am 05.09.2018 um 09:55 schrieb Corinna Vinschen:
> On Sep  4 04:40, Steven Penny wrote:
>> On Tue, 4 Sep 2018 11:00:00, Corinna Vinschen wrote:
>>> Whereever you get DejaVu Sans Mono from.
>> Cygwin provides it via the "dejavu-fonts" package, or you can get it here:
>>
>> http://dejavu-fonts.github.io
>>
>>> My W10 console only allows to specify a handful of fonts, Consolas, Courier
>>> New, Lucida, MS Gothic, NSimSun, Raster Fonts, SimSun-ExtB.
>> You can add DejaVu or others like this:
>>
>> http://superuser.com/questions/390933/add-font-cmd-window-choices/956818
> I added DejaVu Sans Mono per the above and to my surprise I see this:
>
>    $ cat alfa.txt
>    �
>
> So it looks like Deja Vu has a 0xfffd char.  However, GetGlyphIndicesW
> claims otherwise:
>
>    static const wchar_t replacement_char[3] =
>      {
>        0xfffd, /* REPLACEMENT CHARACTER */
>        0x25a1, /* WHITE SQUARE */
>        0x2592  /* MEDIUM SHADE */
>      };
>    WORD gi[3] = { 0, 0, 0 };
>    [...]
>    GetGlyphIndicesW (cdc, replacement_char, 3, gi, GGI_MARK_NONEXISTING_GLYPHS);
>    printf ("gi = %u %u %u\n", gi[0], gi[1], gi[2]);
>
> This prints:
>
>    gi = 65535 401 372
>
> That means, the notdef glyph for DejaVu looks like 0xfffd, but isn't, right?
I guess it means that (or something subtle related to font-fallback 
although we previously concluded the console wouldn't support it...).
My vote remains for going back to MEDIUM SHADE, for 2.11.2 then..., 
unless we find a working detection function.
Thomas


^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: Cygwin fails to utilize Unicode replacement character
  2018-09-05  7:55                                   ` Corinna Vinschen
  2018-09-05  9:22                                     ` Thomas Wolff
@ 2018-09-05 11:58                                     ` Steven Penny
  2018-09-05 13:18                                       ` Marco Atzeri
  2018-09-05 13:35                                       ` Andrey Repin
  1 sibling, 2 replies; 62+ messages in thread
From: Steven Penny @ 2018-09-05 11:58 UTC (permalink / raw)
  To: cygwin

On Wed, 5 Sep 2018 09:55:28, Corinna Vinschen wrote:
> I added DejaVu Sans Mono per the above and to my surprise I see this:
>
>   $ cat alfa.txt
>   �
>
> So it looks like Deja Vu has a 0xfffd char.  However, GetGlyphIndicesW
> claims otherwise:

a character that DejaVu Sans Mono actually doesnt have is:

    U+01C4 LATIN CAPITAL LETTER DZ WITH CARON

Using this file:

    $ cat glyph.c
    #include <stdio.h>
    #include <windows.h>
    int main()
    {
      CONSOLE_FONT_INFOEX ta;
      ta.cbSize = sizeof ta;
      GetCurrentConsoleFontEx(GetStdHandle(STD_OUTPUT_HANDLE), 0, &ta);
      HDC wh = GetDC(0);
      SelectObject(wh,
        CreateFontW(0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, ta.FaceName));
      WCHAR xr[4] = {0xFFFD, 0x2592, 0x25A1, 0x01C4};
      WORD zu[4];
      GetGlyphIndicesW(wh, xr, 4, zu, 1);
      printf("%ls:\n", ta.FaceName);
      for (int q = 0; q < 4; q++) {
        printf("  U+%04X: %s\n",
        xr[q], zu[q] == 0xffff ? "failure" : "success");
      }
    }

I get this result:

    DejaVu Sans Mono:
      U+FFFD: success
      U+2592: success
      U+25A1: success
      U+01C4: failure
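
For readers following along: the literal 1 passed as the last argument to GetGlyphIndicesW in glyph.c is GGI_MARK_NONEXISTING_GLYPHS, which makes the function store 0xffff for every character the font lacks. A minimal sketch of how such a result could drive a fallback choice is shown below. The helper name pick_replacement and the candidate ordering are assumptions for illustration only, not the code Cygwin uses.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Value stored for a missing glyph when GetGlyphIndicesW is called with
   GGI_MARK_NONEXISTING_GLYPHS. */
#define GLYPH_MISSING 0xffffu

/* Hypothetical helper: return the first candidate code point whose glyph
   index is not GLYPH_MISSING, or 0 if the font has none of them. */
uint16_t
pick_replacement (const uint16_t *candidates, const uint16_t *glyph_idx,
                  size_t n)
{
  assert (candidates != NULL && glyph_idx != NULL);
  for (size_t i = 0; i < n; i++)
    if (glyph_idx[i] != GLYPH_MISSING)
      return candidates[i];
  return 0;
}
```

Fed the candidate order 0xfffd, 0x25a1, 0x2592 and the indices from the "gi = 65535 401 372" output quoted earlier in the thread, this returns 0x25a1, because the first entry is marked missing.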



^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: Cygwin fails to utilize Unicode replacement character
  2018-09-05 11:58                                     ` Steven Penny
@ 2018-09-05 13:18                                       ` Marco Atzeri
  2018-09-05 15:20                                         ` Andrey Repin
  2018-09-05 15:58                                         ` Corinna Vinschen
  2018-09-05 13:35                                       ` Andrey Repin
  1 sibling, 2 replies; 62+ messages in thread
From: Marco Atzeri @ 2018-09-05 13:18 UTC (permalink / raw)
  To: cygwin

On 05.09.2018 at 13:58, Steven Penny wrote:
> On Wed, 5 Sep 2018 09:55:28, Corinna Vinschen wrote:

> Using this file:
> 
>     $ cat glyph.c
>     #include <stdio.h>
>     #include <windows.h>
>     int main()
>     {
>       CONSOLE_FONT_INFOEX ta;
>       ta.cbSize = sizeof ta;
>       GetCurrentConsoleFontEx(GetStdHandle(STD_OUTPUT_HANDLE), 0, &ta);
>       HDC wh = GetDC(0);
>       SelectObject(wh,
>         CreateFontW(0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, ta.FaceName));
>       WCHAR xr[4] = {0xFFFD, 0x2592, 0x25A1, 0x01C4};
>       WORD zu[4];
>       GetGlyphIndicesW(wh, xr, 4, zu, 1);
>       printf("%ls:\n", ta.FaceName);
>       for (int q = 0; q < 4; q++) {
>         printf("  U+%04X: %s\n",
>         xr[q], zu[q] == 0xffff ? "failure" : "success");
>       }
>     }
> 
> I get this result:
> 
>     DejaVu Sans Mono:
>       U+FFFD: success
>       U+2592: success
>       U+25A1: success
>       U+01C4: failure
> 

Strange, on W10 CMD I obtain:

DejaVu Sans Mono  U+FFFD: failure
   U+2592: failure
   U+25A1: failure
   U+01C4: failure


Consolas:
   U+FFFD: failure
   U+2592: success
   U+25A1: success
   U+01C4: success

Maybe the original Windows "DejaVu Sans Mono" is incomplete?




^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: Cygwin fails to utilize Unicode replacement character
  2018-09-05 11:58                                     ` Steven Penny
  2018-09-05 13:18                                       ` Marco Atzeri
@ 2018-09-05 13:35                                       ` Andrey Repin
  2018-09-05 14:04                                         ` Houder
  1 sibling, 1 reply; 62+ messages in thread
From: Andrey Repin @ 2018-09-05 13:35 UTC (permalink / raw)
  To: Steven Penny, cygwin

Greetings, Steven Penny!

> a character that DejaVu Sans Mono actually doesnt have is:

>     U+01C4 LATIN CAPITAL LETTER DZ WITH CARON

> Using this file:

How to compile it?
Simple "gcc glyph.c" fails with

/tmp/ccSCYXAP.o:glyph.c:(.text+0xbd): undefined reference to `__imp_CreateFontW'
/tmp/ccSCYXAP.o:glyph.c:(.text+0xbd): relocation truncated to fit: R_X86_64_PC32 against undefined symbol `__imp_CreateFontW'
/tmp/ccSCYXAP.o:glyph.c:(.text+0xd0): undefined reference to `__imp_SelectObject'
/tmp/ccSCYXAP.o:glyph.c:(.text+0xd0): relocation truncated to fit: R_X86_64_PC32 against undefined symbol `__imp_SelectObject'
/tmp/ccSCYXAP.o:glyph.c:(.text+0x111): undefined reference to `__imp_GetGlyphIndicesW'
/tmp/ccSCYXAP.o:glyph.c:(.text+0x111): relocation truncated to fit: R_X86_64_PC32 against undefined symbol `__imp_GetGlyphIndicesW'
collect2: error: ld returned 1 exit status

Though I see the header files present at their appropriate places.

>     $ cat glyph.c
>     #include <stdio.h>
>     #include <windows.h>
>     int main()
>     {
>       CONSOLE_FONT_INFOEX ta;
>       ta.cbSize = sizeof ta;
>       GetCurrentConsoleFontEx(GetStdHandle(STD_OUTPUT_HANDLE), 0, &ta);
>       HDC wh = GetDC(0);
>       SelectObject(wh,
>         CreateFontW(0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, ta.FaceName));
>       WCHAR xr[4] = {0xFFFD, 0x2592, 0x25A1, 0x01C4};
>       WORD zu[4];
>       GetGlyphIndicesW(wh, xr, 4, zu, 1);
>       printf("%ls:\n", ta.FaceName);
>       for (int q = 0; q < 4; q++) {
>         printf("  U+%04X: %s\n",
>         xr[q], zu[q] == 0xffff ? "failure" : "success");
>       }
>     }

> I get this result:

>     DejaVu Sans Mono:
>       U+FFFD: success
>       U+2592: success
>       U+25A1: success
>       U+01C4: failure


-- 
With best regards,
Andrey Repin
Wednesday, September 5, 2018 16:30:20

Sorry for my terrible english...



^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: Cygwin fails to utilize Unicode replacement character
  2018-09-05 13:35                                       ` Andrey Repin
@ 2018-09-05 14:04                                         ` Houder
  2018-09-05 15:05                                           ` Andrey Repin
  0 siblings, 1 reply; 62+ messages in thread
From: Houder @ 2018-09-05 14:04 UTC (permalink / raw)
  To: cygwin

On Wed, 5 Sep 2018 16:31:33, Andrey Repin wrote:
> Greetings, Steven Penny!
> 
> > a character that DejaVu Sans Mono actually doesnt have is:
> 
> >     U+01C4 LATIN CAPITAL LETTER DZ WITH CARON
> 
> > Using this file:
> 
> How to compile it?
> Simple "gcc glyph.c" fails with
> 
> /tmp/ccSCYXAP.o:glyph.c:(.text+0xbd): undefined reference to `__imp_CreateFontW'
> /tmp/ccSCYXAP.o:glyph.c:(.text+0xbd): relocation truncated to fit: R_X86_64_PC32 against undefined symbol `__imp_CreateFontW'
> /tmp/ccSCYXAP.o:glyph.c:(.text+0xd0): undefined reference to `__imp_SelectObject'
> /tmp/ccSCYXAP.o:glyph.c:(.text+0xd0): relocation truncated to fit: R_X86_64_PC32 against undefined symbol `__imp_SelectObject'
> /tmp/ccSCYXAP.o:glyph.c:(.text+0x111): undefined reference to `__imp_GetGlyphIndicesW'
> /tmp/ccSCYXAP.o:glyph.c:(.text+0x111): relocation truncated to fit: R_X86_64_PC32 against undefined symbol `__imp_GetGlyphIndicesW'
> collect2: error: ld returned 1 exit status

64-@@ gcc -o glyph glyph.c -lgdi32

?

Henri



^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: Cygwin fails to utilize Unicode replacement character
  2018-09-05 14:04                                         ` Houder
@ 2018-09-05 15:05                                           ` Andrey Repin
  0 siblings, 0 replies; 62+ messages in thread
From: Andrey Repin @ 2018-09-05 15:05 UTC (permalink / raw)
  To: Houder, cygwin

Greetings, Houder!

>> > a character that DejaVu Sans Mono actually doesnt have is:
>> 
>> >     U+01C4 LATIN CAPITAL LETTER DZ WITH CARON
>> 
>> > Using this file:
>> 
>> How to compile it?
>> Simple "gcc glyph.c" fails with
>> 
>> /tmp/ccSCYXAP.o:glyph.c:(.text+0xbd): undefined reference to `__imp_CreateFontW'
>> /tmp/ccSCYXAP.o:glyph.c:(.text+0xbd): relocation truncated to fit: R_X86_64_PC32 against undefined symbol `__imp_CreateFontW'
>> /tmp/ccSCYXAP.o:glyph.c:(.text+0xd0): undefined reference to `__imp_SelectObject'
>> /tmp/ccSCYXAP.o:glyph.c:(.text+0xd0): relocation truncated to fit: R_X86_64_PC32 against undefined symbol `__imp_SelectObject'
>> /tmp/ccSCYXAP.o:glyph.c:(.text+0x111): undefined reference to `__imp_GetGlyphIndicesW'
>> /tmp/ccSCYXAP.o:glyph.c:(.text+0x111): relocation truncated to fit: R_X86_64_PC32 against undefined symbol `__imp_GetGlyphIndicesW'
>> collect2: error: ld returned 1 exit status

> 64-@@ gcc -o glyph glyph.c -lgdi32

Thanks, that's better. Somewhat.


-- 
With best regards,
Andrey Repin
Wednesday, September 5, 2018 17:52:46

Sorry for my terrible english...



^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: Cygwin fails to utilize Unicode replacement character
  2018-09-05 13:18                                       ` Marco Atzeri
@ 2018-09-05 15:20                                         ` Andrey Repin
  2018-09-05 15:58                                         ` Corinna Vinschen
  1 sibling, 0 replies; 62+ messages in thread
From: Andrey Repin @ 2018-09-05 15:20 UTC (permalink / raw)
  To: Marco Atzeri, cygwin

Greetings, Marco Atzeri!

> Strange on W10 CMD I obtain

> DejaVu Sans Mono
>    U+FFFD: failure
>    U+2592: failure
>    U+25A1: failure
>    U+01C4: failure


> Consolas:
>    U+FFFD: failure
>    U+2592: success
>    U+25A1: success
>    U+01C4: success

> May be original Windows "DejaVu Sans Mono" is incomplete ?

Win7 64,

Terminal:
  U+FFFD: failure
  U+2592: failure
  U+25A1: failure
  U+01C4: failure

Consolas:
  U+FFFD: failure
  U+2592: success
  U+25A1: success
  U+01C4: success

Lucida Console:
  U+FFFD: failure
  U+2592: success
  U+25A1: failure
  U+01C4: failure

DejaVu Sans Mono:
  U+FFFD: success
  U+2592: success
  U+25A1: success
  U+01C4: failure

This is DejaVu Sans Mono 2.33, as shipped in the official 2.37 release.


-- 
With best regards,
Andrey Repin
Wednesday, September 5, 2018 17:57:15

Sorry for my terrible english...



^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: Cygwin fails to utilize Unicode replacement character
  2018-09-05 13:18                                       ` Marco Atzeri
  2018-09-05 15:20                                         ` Andrey Repin
@ 2018-09-05 15:58                                         ` Corinna Vinschen
  2018-09-05 20:15                                           ` Corinna Vinschen
  1 sibling, 1 reply; 62+ messages in thread
From: Corinna Vinschen @ 2018-09-05 15:58 UTC (permalink / raw)
  To: cygwin

[-- Attachment #1: Type: text/plain, Size: 1902 bytes --]

On Sep  5 15:18, Marco Atzeri wrote:
> On 05.09.2018 at 13:58, Steven Penny wrote:
> > On Wed, 5 Sep 2018 09:55:28, Corinna Vinschen wrote:
> 
> > Using this file:
> > 
> >     $ cat glyph.c
> >     #include <stdio.h>
> >     #include <windows.h>
> >     int main()
> >     {
> >       CONSOLE_FONT_INFOEX ta;
> >       ta.cbSize = sizeof ta;
> >       GetCurrentConsoleFontEx(GetStdHandle(STD_OUTPUT_HANDLE), 0, &ta);
> >       HDC wh = GetDC(0);
> >       SelectObject(wh,
> >         CreateFontW(0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, ta.FaceName));
> >       WCHAR xr[4] = {0xFFFD, 0x2592, 0x25A1, 0x01C4};
> >       WORD zu[4];
> >       GetGlyphIndicesW(wh, xr, 4, zu, 1);
> >       printf("%ls:\n", ta.FaceName);
> >       for (int q = 0; q < 4; q++) {
> >         printf("  U+%04X: %s\n",
> >         xr[q], zu[q] == 0xffff ? "failure" : "success");
> >       }
> >     }
> > 
> > I get this result:
> > 
> >     DejaVu Sans Mono:
> >       U+FFFD: success
> >       U+2592: success
> >       U+25A1: success
> >       U+01C4: failure
> > 
> 
> Strange on W10 CMD I obtain
> 
> DejaVu Sans Mono  U+FFFD: failure
                 ^^^
You see this?  There's something really fishy here.  I see a similar
effect which somehow depends on arbitrary changes to the source file:

- Sometimes I get "DejaVu Sans Mono" in FaceName and all works well.
- Sometimes I get "DejaVu Sans Mono\1" or "DejaVu Sans Mono\6" and
  the subsequent GetGlyphIndicesW returns failures for many or all
  characters.
  
I've been trying for hours to find what's affecting this, but I don't get any
conclusive results :(
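
One way to detect the symptom described above before calling GetGlyphIndicesW is to validate the face name buffer. The following is only a diagnostic sketch: the function name face_name_is_clean is hypothetical, FACE_CAP is assumed to mirror LF_FACESIZE (32, the size of the FaceName array), and it says nothing about why the stray \1 or \6 shows up in the first place.

```c
#include <stddef.h>
#include <wchar.h>

/* Assumed capacity, mirroring LF_FACESIZE (32) from the Windows headers. */
#define FACE_CAP 32

/* Hypothetical check: returns 1 if name is non-empty, NUL-terminated
   within FACE_CAP characters and free of control characters, else 0. */
int
face_name_is_clean (const wchar_t *name)
{
  for (size_t i = 0; i < FACE_CAP; i++)
    {
      if (name[i] == L'\0')
        return i > 0;   /* terminated; an empty name is rejected */
      if (name[i] < 0x20)
        return 0;       /* stray control character such as \1 or \6 */
    }
  return 0;             /* no terminator inside the buffer */
}
```

With this check, "DejaVu Sans Mono" passes while "DejaVu Sans Mono\1" and "DejaVu Sans Mono\6" are rejected, so a test program could report the bad name instead of printing misleading glyph failures.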


Corinna

-- 
Corinna Vinschen                  Please, send mails regarding Cygwin to
Cygwin Maintainer                 cygwin AT cygwin DOT com
Red Hat

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: Cygwin fails to utilize Unicode replacement character
  2018-09-05 15:58                                         ` Corinna Vinschen
@ 2018-09-05 20:15                                           ` Corinna Vinschen
  2018-09-06  1:35                                             ` Steven Penny
  0 siblings, 1 reply; 62+ messages in thread
From: Corinna Vinschen @ 2018-09-05 20:15 UTC (permalink / raw)
  To: cygwin

[-- Attachment #1: Type: text/plain, Size: 2256 bytes --]

On Sep  5 17:58, Corinna Vinschen wrote:
> On Sep  5 15:18, Marco Atzeri wrote:
> > On 05.09.2018 at 13:58, Steven Penny wrote:
> > > On Wed, 5 Sep 2018 09:55:28, Corinna Vinschen wrote:
> > 
> > > Using this file:
> > > 
> > >     $ cat glyph.c
> > >     #include <stdio.h>
> > >     #include <windows.h>
> > >     int main()
> > >     {
> > >       CONSOLE_FONT_INFOEX ta;
> > >       ta.cbSize = sizeof ta;
> > >       GetCurrentConsoleFontEx(GetStdHandle(STD_OUTPUT_HANDLE), 0, &ta);
> > >       HDC wh = GetDC(0);
> > >       SelectObject(wh,
> > >         CreateFontW(0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, ta.FaceName));
> > >       WCHAR xr[4] = {0xFFFD, 0x2592, 0x25A1, 0x01C4};
> > >       WORD zu[4];
> > >       GetGlyphIndicesW(wh, xr, 4, zu, 1);
> > >       printf("%ls:\n", ta.FaceName);
> > >       for (int q = 0; q < 4; q++) {
> > >         printf("  U+%04X: %s\n",
> > >         xr[q], zu[q] == 0xffff ? "failure" : "success");
> > >       }
> > >     }
> > > 
> > > I get this result:
> > > 
> > >     DejaVu Sans Mono:
> > >       U+FFFD: success
> > >       U+2592: success
> > >       U+25A1: success
> > >       U+01C4: failure
> > > 
> > 
> > Strange on W10 CMD I obtain
> > 
> > DejaVu Sans Mono  U+FFFD: failure
>                  ^^^
> You see this?  There's something really fishy here.  I see a similar
> effect which somehow depends on arbitrary changes to the source file:
> 
> - Sometimes I get "DejaVu Sans Mono" in FaceName and all works well.
> - Sometimes I get "DejaVu Sans Mono\1" or "DejaVu Sans Mono\6" and
>   the subsequent GetGlyphIndicesW returns failures for many or all
>   characters.
>   
> I'm trying to find what's affecting this for hours, but I don't get any
> conclusive results :(

OTOH, in my testing this only occurs for DejaVu Sans Mono.  I installed
Liberation Mono and Noto Mono as well and the above problem never occurs
with them.  Weird.  I'm about to let this slip as a font bug.


Corinna

-- 
Corinna Vinschen                  Please, send mails regarding Cygwin to
Cygwin Maintainer                 cygwin AT cygwin DOT com
Red Hat

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: Cygwin fails to utilize Unicode replacement character
  2018-09-05 20:15                                           ` Corinna Vinschen
@ 2018-09-06  1:35                                             ` Steven Penny
  2018-09-06  7:01                                               ` Corinna Vinschen
  0 siblings, 1 reply; 62+ messages in thread
From: Steven Penny @ 2018-09-06  1:35 UTC (permalink / raw)
  To: cygwin

On Wed, 5 Sep 2018 22:14:59, Corinna Vinschen wrote:
> OTOH, in my testing this only occurs for DejaVu Sans Mono.  I installed
> Liberation Mono and Noto Mono as well and the above problem never occurs
> with them.  Weird.  I'm about to let this slip as a font bug.

as you prob know ive been testing on W7. i found a W10 virtual machine here:

http://developer.microsoft.com/microsoft-edge/tools/vms

but it requires 4GB RAM just for the image. since i only have 4GB total on my
system the image wont load into virtualbox.

i can see about upgrading my system - but i wont bother if you are intent on
wiping your hands of this anyway

let me know - thanks



^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: Cygwin fails to utilize Unicode replacement character
  2018-09-06  1:35                                             ` Steven Penny
@ 2018-09-06  7:01                                               ` Corinna Vinschen
  2018-09-07  8:20                                                 ` Corinna Vinschen
  0 siblings, 1 reply; 62+ messages in thread
From: Corinna Vinschen @ 2018-09-06  7:01 UTC (permalink / raw)
  To: cygwin

[-- Attachment #1: Type: text/plain, Size: 964 bytes --]

On Sep  5 18:35, Steven Penny wrote:
> On Wed, 5 Sep 2018 22:14:59, Corinna Vinschen wrote:
> > OTOH, in my testing this only occurs for DejaVu Sans Mono.  I installed
> > Liberation Mono and Noto Mono as well and the above problem never occurs
> > with them.  Weird.  I'm about to let this slip as a font bug.
> 
> as you prob know ive been testing on W7. i found a W10 virtual machine here:
> 
> http://developer.microsoft.com/microsoft-edge/tools/vms
> 
> but it requires 4GB RAM just for the image. since i only have 4GB total on my
> system the image wont load into virtualbox.
> 
> i can see about upgrading my system - but i wont bother if you are intent on
> wiping your hands of this anyway
> 
> let me know - thanks

https://cygwin.com/ml/cygwin-cvs/2018-q3/msg00054.html


Corinna

-- 
Corinna Vinschen                  Please, send mails regarding Cygwin to
Cygwin Maintainer                 cygwin AT cygwin DOT com
Red Hat

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: Cygwin fails to utilize Unicode replacement character
  2018-09-06  7:01                                               ` Corinna Vinschen
@ 2018-09-07  8:20                                                 ` Corinna Vinschen
  2018-09-07 10:34                                                   ` Thomas Wolff
  0 siblings, 1 reply; 62+ messages in thread
From: Corinna Vinschen @ 2018-09-07  8:20 UTC (permalink / raw)
  To: cygwin

[-- Attachment #1: Type: text/plain, Size: 1257 bytes --]

On Sep  6 09:01, Corinna Vinschen wrote:
> On Sep  5 18:35, Steven Penny wrote:
> > On Wed, 5 Sep 2018 22:14:59, Corinna Vinschen wrote:
> > > OTOH, in my testing this only occurs for DejaVu Sans Mono.  I installed
> > > Liberation Mono and Noto Mono as well and the above problem never occurs
> > > with them.  Weird.  I'm about to let this slip as a font bug.
> > 
> > as you prob know ive been testing on W7. i found a W10 virtual machine here:
> > 
> > http://developer.microsoft.com/microsoft-edge/tools/vms
> > 
> > but it requires 4GB RAM just for the image. since i only have 4GB total on my
> > system the image wont load into virtualbox.
> > 
> > i can see about upgrading my system - but i wont bother if you are intent on
> > wiping your hands of this anyway
> > 
> > let me know - thanks
> 
> https://cygwin.com/ml/cygwin-cvs/2018-q3/msg00054.html

I created new developer snapshots for testing.  Please give the
latest from https://cygwin.com/snapshots/ a try.

This will be my last action for the next 4 weeks though.  I'll
be back in October.


Thanks,
Corinna

-- 
Corinna Vinschen                  Please, send mails regarding Cygwin to
Cygwin Maintainer                 cygwin AT cygwin DOT com
Red Hat

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: Cygwin fails to utilize Unicode replacement character
  2018-09-07  8:20                                                 ` Corinna Vinschen
@ 2018-09-07 10:34                                                   ` Thomas Wolff
  2018-09-07 11:29                                                     ` Corinna Vinschen
  0 siblings, 1 reply; 62+ messages in thread
From: Thomas Wolff @ 2018-09-07 10:34 UTC (permalink / raw)
  To: cygwin

On 07.09.2018 10:17, Corinna Vinschen wrote:
> On Sep  6 09:01, Corinna Vinschen wrote:
>> On Sep  5 18:35, Steven Penny wrote:
>>> On Wed, 5 Sep 2018 22:14:59, Corinna Vinschen wrote:
>>>> OTOH, in my testing this only occurs for DejaVu Sans Mono.  I installed
>>>> Liberation Mono and Noto Mono as well and the above problem never occurs
>>>> with them.  Weird.  I'm about to let this slip as a font bug.
>>> as you prob know ive been testing on W7. i found a W10 virtual machine here:
>>>
>>> http://developer.microsoft.com/microsoft-edge/tools/vms
>>>
>>> but it requires 4GB RAM just for the image. since i only have 4GB total on my
>>> system the image wont load into virtualbox.
>>>
>>> i can see about upgrading my system - but i wont bother if you are intent on
>>> wiping your hands of this anyway
>>>
>>> let me know - thanks
>> https://cygwin.com/ml/cygwin-cvs/2018-q3/msg00054.html
> I created new developer snapshots for testing.  Please give the latest from https://cygwin.com/snapshots/ a try.
Consolas: invalid encoding: hollow box, unknown glyph: boxed question mark
Lucida Console: invalid encoding: medium shade, unknown glyph: hollow box
so far it's fine, but:
Raster Fonts: output of invalid encoding hangs cygwin...

> This will be my last action for the next 4 weeks though.  I'll be back in October.
I'll try to check the code.


^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: Cygwin fails to utilize Unicode replacement character
  2018-09-07 10:34                                                   ` Thomas Wolff
@ 2018-09-07 11:29                                                     ` Corinna Vinschen
  2018-09-07 11:42                                                       ` Thomas Wolff
  0 siblings, 1 reply; 62+ messages in thread
From: Corinna Vinschen @ 2018-09-07 11:29 UTC (permalink / raw)
  To: cygwin

[-- Attachment #1: Type: text/plain, Size: 1735 bytes --]

On Sep  7 12:34, Thomas Wolff wrote:
> On 07.09.2018 10:17, Corinna Vinschen wrote:
> > On Sep  6 09:01, Corinna Vinschen wrote:
> > > On Sep  5 18:35, Steven Penny wrote:
> > > > On Wed, 5 Sep 2018 22:14:59, Corinna Vinschen wrote:
> > > > > OTOH, in my testing this only occurs for DejaVu Sans Mono.  I installed
> > > > > Liberation Mono and Noto Mono as well and the above problem never occurs
> > > > > with them.  Weird.  I'm about to let this slip as a font bug.
> > > > as you prob know ive been testing on W7. i found a W10 virtual machine here:
> > > > 
> > > > http://developer.microsoft.com/microsoft-edge/tools/vms
> > > > 
> > > > but it requires 4GB RAM just for the image. since i only have 4GB total on my
> > > > system the image wont load into virtualbox.
> > > > 
> > > > i can see about upgrading my system - but i wont bother if you are intent on
> > > > wiping your hands of this anyway
> > > > 
> > > > let me know - thanks
> > > https://cygwin.com/ml/cygwin-cvs/2018-q3/msg00054.html
> > I created new developer snapshots for testing.  Please give the latest from https://cygwin.com/snapshots/ a try.
> Consolas: invalid encoding: hollow box, unknown glyph: boxed question mark
> Lucida Console: invalid encoding: medium shade, unknown glyph: hollow box
> so far it's fine, but:
> Raster Fonts: output of invalid encoding hangs cygwin...
> 
> > This will be my last action for the next 4 weeks though.  I'll be back in October.
> I'll try to check the code.

Looks like s/ANSI_CHARSET/DEFAULT_CHARSET/ does the trick


Corinna

-- 
Corinna Vinschen                  Please, send mails regarding Cygwin to
Cygwin Maintainer                 cygwin AT cygwin DOT com
Red Hat

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: Cygwin fails to utilize Unicode replacement character
  2018-09-07 11:29                                                     ` Corinna Vinschen
@ 2018-09-07 11:42                                                       ` Thomas Wolff
  2018-09-07 11:51                                                         ` Thomas Wolff
  0 siblings, 1 reply; 62+ messages in thread
From: Thomas Wolff @ 2018-09-07 11:42 UTC (permalink / raw)
  To: cygwin

On 07.09.2018 13:29, Corinna Vinschen wrote:
> On Sep  7 12:34, Thomas Wolff wrote:
>> On 07.09.2018 10:17, Corinna Vinschen wrote:
>>> On Sep  6 09:01, Corinna Vinschen wrote:
>>>> On Sep  5 18:35, Steven Penny wrote:
>>>>> On Wed, 5 Sep 2018 22:14:59, Corinna Vinschen wrote:
>>>>>> OTOH, in my testing this only occurs for DejaVu Sans Mono.  I installed
>>>>>> Liberation Mono and Noto Mono as well and the above problem never occurs
>>>>>> with them.  Weird.  I'm about to let this slip as a font bug.
>>>>> as you prob know ive been testing on W7. i found a W10 virtual machine here:
>>>>>
>>>>> http://developer.microsoft.com/microsoft-edge/tools/vms
>>>>>
>>>>> but it requires 4GB RAM just for the image. since i only have 4GB total on my
>>>>> system the image wont load into virtualbox.
>>>>>
>>>>> i can see about upgrading my system - but i wont bother if you are intent on
>>>>> wiping your hands of this anyway
>>>>>
>>>>> let me know - thanks
>>>> https://cygwin.com/ml/cygwin-cvs/2018-q3/msg00054.html
>>> I created new developer snapshots for testing.  Please give the latest from https://cygwin.com/snapshots/ a try.
>> Consolas: invalid encoding: hollow box, unknown glyph: boxed question mark
>> Lucida Console: invalid encoding: medium shade, unknown glyph: hollow box
>> so far it's fine, but:
>> Raster Fonts: output of invalid encoding hangs cygwin...
>>
>>> This will be my last action for the next 4 weeks though.  I'll be back in October.
>> I'll try to check the code.
> Looks like s/ANSI_CHARSET/DEFAULT_CHARSET/ does the trick
Without this change, lf.lfFaceName is "T" when entering the do...while loop.
What's the purpose of this nested loop (do...while and EnumFontFamilies) 
anyway?
Thomas


^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: Cygwin fails to utilize Unicode replacement character
  2018-09-07 11:42                                                       ` Thomas Wolff
@ 2018-09-07 11:51                                                         ` Thomas Wolff
  2018-09-07 11:54                                                           ` Corinna Vinschen
  0 siblings, 1 reply; 62+ messages in thread
From: Thomas Wolff @ 2018-09-07 11:51 UTC (permalink / raw)
  To: cygwin

On 07.09.2018 13:41, Thomas Wolff wrote:
> On 07.09.2018 13:29, Corinna Vinschen wrote:
>> On Sep  7 12:34, Thomas Wolff wrote:
>>> On 07.09.2018 10:17, Corinna Vinschen wrote:
>>>> On Sep  6 09:01, Corinna Vinschen wrote:
>>>>> On Sep  5 18:35, Steven Penny wrote:
>>>>>> On Wed, 5 Sep 2018 22:14:59, Corinna Vinschen wrote:
>>>>>>> OTOH, in my testing this only occurs for DejaVu Sans Mono.  I installed
>>>>>>> Liberation Mono and Noto Mono as well and the above problem never occurs
>>>>>>> with them.  Weird.  I'm about to let this slip as a font bug.
>>>>>> as you prob know ive been testing on W7. i found a W10 virtual machine here:
>>>>>>
>>>>>> http://developer.microsoft.com/microsoft-edge/tools/vms
>>>>>>
>>>>>> but it requires 4GB RAM just for the image. since i only have 4GB total on my
>>>>>> system the image wont load into virtualbox.
>>>>>>
>>>>>> i can see about upgrading my system - but i wont bother if you are intent on
>>>>>> wiping your hands of this anyway
>>>>>>
>>>>>> let me know - thanks
>>>>> https://cygwin.com/ml/cygwin-cvs/2018-q3/msg00054.html
>>>> I created new developer snapshots for testing.  Please give the latest from https://cygwin.com/snapshots/ a try.
>>> Consolas: invalid encoding: hollow box, unknown glyph: boxed question mark
>>> Lucida Console: invalid encoding: medium shade, unknown glyph: hollow box
>>> so far it's fine, but:
>>> Raster Fonts: output of invalid encoding hangs cygwin...
>>>
>>>> This will be my last action for the next 4 weeks though.  I'll be back in October.
>>> I'll try to check the code.
>> Looks like s/ANSI_CHARSET/DEFAULT_CHARSET/ does the trick
> Without this change, lf.lfFaceName is "T" when entering the do...while 
> loop.
No, sorry, it's "Terminal" initially and then shortened down to "T" by 
one char each in the loop.
> What's the purpose of this nested loop (do...while and 
> EnumFontFamilies) anyway?



^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: Cygwin fails to utilize Unicode replacement character
  2018-09-07 11:51                                                         ` Thomas Wolff
@ 2018-09-07 11:54                                                           ` Corinna Vinschen
  2018-09-07 16:22                                                             ` Brian Inglis
  2018-09-07 16:48                                                             ` Brian Inglis
  0 siblings, 2 replies; 62+ messages in thread
From: Corinna Vinschen @ 2018-09-07 11:54 UTC (permalink / raw)
  To: cygwin


On Sep  7 13:51, Thomas Wolff wrote:
> On 07.09.2018 13:41, Thomas Wolff wrote:
> > On 07.09.2018 13:29, Corinna Vinschen wrote:
> > > On Sep  7 12:34, Thomas Wolff wrote:
> > > > On 07.09.2018 10:17, Corinna Vinschen wrote:
> > > > > On Sep  6 09:01, Corinna Vinschen wrote:
> > > > > > On Sep  5 18:35, Steven Penny wrote:
> > > > > > > On Wed, 5 Sep 2018 22:14:59, Corinna Vinschen wrote:
> > > > > > > > OTOH, in my testing this only occurs for DejaVu Sans Mono.  I installed
> > > > > > > > Liberation Mono and Noto Mono as well and the above problem never occurs
> > > > > > > > with them.  Weird.  I'm about to let this slip as a font bug.
> > > > > > > as you prob know ive been testing on W7. i found a W10 virtual machine here:
> > > > > > > 
> > > > > > > http://developer.microsoft.com/microsoft-edge/tools/vms
> > > > > > > 
> > > > > > > but it requires 4GB RAM just for the image. since i only have 4GB total on my
> > > > > > > system the image wont load into virtualbox.
> > > > > > > 
> > > > > > > i can see about upgrading my system - but i wont bother if you are intent on
> > > > > > > wiping your hands of this anyway
> > > > > > > 
> > > > > > > let me know - thanks
> > > > > > https://cygwin.com/ml/cygwin-cvs/2018-q3/msg00054.html
> > > > > I created new developer snapshots for testing.  Please give the latest from https://cygwin.com/snapshots/ a try.
> > > > Consolas: invalid encoding: hollow box, unknown glyph: boxed question mark
> > > > Lucida Console: invalid encoding: medium shade, unknown glyph: hollow box
> > > > so far it's fine, but:
> > > > Raster Fonts: output of invalid encoding hangs cygwin...
> > > > 
> > > > > This will be my last action for the next 4 weeks though.  I'll be back in October.
> > > > I'll try to check the code.
> > > Looks like s/ANSI_CHARSET/DEFAULT_CHARSET/ does the trick
> > Without this change, lf.lfFaceName is "T" when entering the do...while
> > loop.
> No, sorry, it's "Terminal" initially and then shortened down to "T" by one
> char each in the loop.
> > What's the purpose of this nested loop (do...while and EnumFontFamilies)
> > anyway?

The loop is handling the weird DejaVu Sans Mono behaviour I explained
in previous mail.

I uploaded new snapshots to https://cygwin.de/snapshots/


Enjoy,
Corinna

-- 
Corinna Vinschen                  Please, send mails regarding Cygwin to
Cygwin Maintainer                 cygwin AT cygwin DOT com
Red Hat


^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: Cygwin fails to utilize Unicode replacement character
  2018-09-07 11:54                                                           ` Corinna Vinschen
@ 2018-09-07 16:22                                                             ` Brian Inglis
  2018-09-07 16:48                                                             ` Brian Inglis
  1 sibling, 0 replies; 62+ messages in thread
From: Brian Inglis @ 2018-09-07 16:22 UTC (permalink / raw)
  To: cygwin

On 2018-09-07 05:54, Corinna Vinschen wrote:
> On Sep  7 13:51, Thomas Wolff wrote:
>> On 07.09.2018 13:41, Thomas Wolff wrote:
>>> On 07.09.2018 13:29, Corinna Vinschen wrote:
>>>> On Sep  7 12:34, Thomas Wolff wrote:
>>>>> On 07.09.2018 10:17, Corinna Vinschen wrote:
>>>>>> On Sep  6 09:01, Corinna Vinschen wrote:
>>>>>>> On Sep  5 18:35, Steven Penny wrote:
>>>>>>>> On Wed, 5 Sep 2018 22:14:59, Corinna Vinschen wrote:
>>>>>>>>> OTOH, in my testing this only occurs for DejaVu Sans Mono.  I installed
>>>>>>>>> Liberation Mono and Noto Mono as well and the above problem never occurs
>>>>>>>>> with them.  Weird.  I'm about to let this slip as a font bug.
>>>>>>>> as you prob know ive been testing on W7. i found a W10 virtual machine here:
>>>>>>>>
>>>>>>>> http://developer.microsoft.com/microsoft-edge/tools/vms
>>>>>>>>
>>>>>>>> but it requires 4GB RAM just for the image. since i only have 4GB total on my
>>>>>>>> system the image wont load into virtualbox.
>>>>>>>>
>>>>>>>> i can see about upgrading my system - but i wont bother if you are intent on
>>>>>>>> wiping your hands of this anyway
>>>>>>>>
>>>>>>>> let me know - thanks
>>>>>>> https://cygwin.com/ml/cygwin-cvs/2018-q3/msg00054.html
>>>>>> I created new developer snapshots for testing.  Please give the latest from https://cygwin.com/snapshots/ a try.
>>>>> Consolas: invalid encoding: hollow box, unknown glyph: boxed question mark
>>>>> Lucida Console: invalid encoding: medium shade, unknown glyph: hollow box
>>>>> so far it's fine, but:
>>>>> Raster Fonts: output of invalid encoding hangs cygwin...
>>>>>
>>>>>> This will be my last action for the next 4 weeks though.  I'll be back in October.
>>>>> I'll try to check the code.
>>>> Looks like s/ANSI_CHARSET/DEFAULT_CHARSET/ does the trick
>>> Without this change, lf.lfFaceName is "T" when entering the do...while
>>> loop.
>> No, sorry, it's "Terminal" initially and then shortened down to "T" by one
>> char each in the loop.
>>> What's the purpose of this nested loop (do...while and EnumFontFamilies)
>>> anyway?
> 
> The loop is handling the weird DejaVu Sans Mono behaviour I explained
> in previous mail.
> 
> I uploaded new snapshots to https://cygwin.de/snapshots/

404?

https://cygwin.de/
"At present, the German homepage Cygwin.de is not yet being maintained any further."

Try https://cygwin.com/snapshots/

-- 
Take care. Thanks, Brian Inglis, Calgary, Alberta, Canada

This email may be disturbing to some readers as it contains
too much technical detail. Reader discretion is advised.


^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: Cygwin fails to utilize Unicode replacement character
  2018-09-07 11:54                                                           ` Corinna Vinschen
  2018-09-07 16:22                                                             ` Brian Inglis
@ 2018-09-07 16:48                                                             ` Brian Inglis
  2018-09-07 17:01                                                               ` Marco Atzeri
  2018-09-07 18:20                                                               ` Corinna Vinschen
  1 sibling, 2 replies; 62+ messages in thread
From: Brian Inglis @ 2018-09-07 16:48 UTC (permalink / raw)
  To: cygwin

On 2018-09-07 05:54, Corinna Vinschen wrote:
> On Sep  7 13:51, Thomas Wolff wrote:
>> On 07.09.2018 13:41, Thomas Wolff wrote:
>>> On 07.09.2018 13:29, Corinna Vinschen wrote:
>>>> On Sep  7 12:34, Thomas Wolff wrote:
>>>>> On 07.09.2018 10:17, Corinna Vinschen wrote:
>>>>>> On Sep  6 09:01, Corinna Vinschen wrote:
>>>>>>> https://cygwin.com/ml/cygwin-cvs/2018-q3/msg00054.html
>>>>>> I created new developer snapshots for testing.  Please give the latest from https://cygwin.com/snapshots/ a try.
>>>>> Raster Fonts: output of invalid encoding hangs cygwin...
>>>>>> This will be my last action for the next 4 weeks though. I'll be
>>>>>> back in October.
>>>> Looks like s/ANSI_CHARSET/DEFAULT_CHARSET/ does the trick
>>> Without this change, lf.lfFaceName is "T" when entering the do...while
>>> loop.
>> No, sorry, it's "Terminal" initially and then shortened down to "T" by one
>> char each in the loop.
>>> What's the purpose of this nested loop (do...while and EnumFontFamilies)
>>> anyway?
> The loop is handling the weird DejaVu Sans Mono behaviour I explained
> in previous mail.

Could the garbage in the font name come from an uninitialized struct on the stack?
Zero it before the call, via bzero, memset, or an implicit initializer:

-  CONSOLE_FONT_INFOEX cfi;
+  CONSOLE_FONT_INFOEX cfi = { 0 };

and remove the loop, which opens an attack vector (rename a good font and
substitute one with a shorter name) or could cause problems by using the wrong
font, e.g. DejaVu Sans.
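
A minimal, portable C sketch of the concern about the name-shortening loop.
It is not the Cygwin code: `installed_fonts` and `resolve_face_name` are
hypothetical names, and the array stands in for whatever EnumFontFamilies
would report. It only assumes the loop trims the reported face name one
character at a time until some installed family matches exactly.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical list of installed font families; a stand-in for the result
   of a real font enumeration. */
static const char *const installed_fonts[] = {
    "Consolas", "DejaVu Sans", "Lucida Console", "Terminal"
};

/* Return 1 if the first `len` bytes of `name` exactly match an installed
   family name, 0 otherwise. */
static int is_installed(const char *name, size_t len)
{
    size_t n = sizeof installed_fonts / sizeof installed_fonts[0];
    for (size_t i = 0; i < n; i++)
        if (strlen(installed_fonts[i]) == len
            && memcmp(installed_fonts[i], name, len) == 0)
            return 1;
    return 0;
}

/* Trim trailing characters from `face` one at a time until the remainder
   names an installed family.  Returns the matched length, or 0 if no
   prefix matches. */
size_t resolve_face_name(const char *face)
{
    assert(face != NULL);
    for (size_t len = strlen(face); len > 0; len--)
        if (is_installed(face, len))
            return len;
    return 0;
}
```

With this list, `resolve_face_name("DejaVu Sans Mono")` returns 11: since no
"DejaVu Sans Mono" family is present, the lookup settles on "DejaVu Sans",
which is the wrong-font case described above.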

You need to self-impose a change freeze before heading out. Once you're in
"stuff to get done before leaving" mode, which may last a day or up to a week,
you delegate or postpone decisions and actions until you return: BTDTGTS (Got
The Scars) ;^>

-- 
Take care. Thanks, Brian Inglis, Calgary, Alberta, Canada

This email may be disturbing to some readers as it contains
too much technical detail. Reader discretion is advised.


^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: Cygwin fails to utilize Unicode replacement character
  2018-09-07 16:48                                                             ` Brian Inglis
@ 2018-09-07 17:01                                                               ` Marco Atzeri
  2018-09-07 18:21                                                                 ` Corinna Vinschen
  2018-09-07 18:20                                                               ` Corinna Vinschen
  1 sibling, 1 reply; 62+ messages in thread
From: Marco Atzeri @ 2018-09-07 17:01 UTC (permalink / raw)
  To: cygwin

On 07.09.2018 at 18:48, Brian Inglis wrote:
> On 2018-09-07 05:54, Corinna Vinschen wrote:
>> On Sep  7 13:51, Thomas Wolff wrote:
>>> On 07.09.2018 13:41, Thomas Wolff wrote:
>>>> On 07.09.2018 13:29, Corinna Vinschen wrote:
>>>>> On Sep  7 12:34, Thomas Wolff wrote:
>>>>>> On 07.09.2018 10:17, Corinna Vinschen wrote:
>>>>>>> On Sep  6 09:01, Corinna Vinschen wrote:
>
> Garbage in font name from uninit struct on stack?
> Before call bzero/memset/implicit:
>
> -  CONSOLE_FONT_INFOEX cfi;
> +  CONSOLE_FONT_INFOEX cfi = { 0 };
>

This change is effective in the test program on W10:

DejaVu Sans Mono:
   U+FFFD: success
   U+2592: success
   U+25A1: success
   U+01C4: failure


^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: Cygwin fails to utilize Unicode replacement character
  2018-09-07 16:48                                                             ` Brian Inglis
  2018-09-07 17:01                                                               ` Marco Atzeri
@ 2018-09-07 18:20                                                               ` Corinna Vinschen
  1 sibling, 0 replies; 62+ messages in thread
From: Corinna Vinschen @ 2018-09-07 18:20 UTC (permalink / raw)
  To: cygwin


On Sep  7 10:48, Brian Inglis wrote:
> On 2018-09-07 05:54, Corinna Vinschen wrote:
> > On Sep  7 13:51, Thomas Wolff wrote:
> >> On 07.09.2018 13:41, Thomas Wolff wrote:
> >>> On 07.09.2018 13:29, Corinna Vinschen wrote:
> >>>> On Sep  7 12:34, Thomas Wolff wrote:
> >>>>> On 07.09.2018 10:17, Corinna Vinschen wrote:
> >>>>>> On Sep  6 09:01, Corinna Vinschen wrote:
> >>>>>>> https://cygwin.com/ml/cygwin-cvs/2018-q3/msg00054.html
> >>>>>> I created new developer snapshots for testing.  Please give the latest from https://cygwin.com/snapshots/ a try.
> >>>>> Raster Fonts: output of invalid encoding hangs cygwin...
> >>>>>> This will be my last action for the next 4 weeks though. I'll be
> >>>>>> back in October.
> >>>> Looks like s/ANSI_CHARSET/DEFAULT_CHARSET/ does the trick
> >>> Without this change, lf.lfFaceName is "T" when entering the do...while
> >>> loop.
> >> No, sorry, it's "Terminal" initially and then shortened down to "T" by one
> >> char each in the loop.
> >>> What's the purpose of this nested loop (do...while and EnumFontFamilies)
> >>> anyway?
> > The loop is handling the weird DejaVu Sans Mono behaviour I explained
> > in previous mail.
> 
> Garbage in font name from uninit struct on stack?
> Before call bzero/memset/implicit:
> 
> -  CONSOLE_FONT_INFOEX cfi;
> +  CONSOLE_FONT_INFOEX cfi = { 0 };

No, that doesn't help.  Do you really think I didn't try this?  Think
about it.  The string returned by GetCurrentConsoleFontEx is supposed to
be \0-terminated.  If the \0 follows *after* the stray characters, where
would they come from if not created by the GetCurrentConsoleFontEx
function itself?
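
The argument can be checked with a small portable C model. This is only a
sketch: `fake_get_face_name` is a hypothetical stand-in, not
GetCurrentConsoleFontEx, and it uses `char` where the real FaceName field
holds wide characters. It assumes the callee itself writes the name, then two
stray characters, then the terminating NUL.

```c
#include <assert.h>
#include <string.h>

#define FACE_LEN 32  /* buffer size for the model; same value as LF_FACESIZE */

/* Hypothetical stand-in for an API that fills in a face name.  It writes the
   real name, two stray bytes, and then the terminating NUL. */
void fake_get_face_name(char out[FACE_LEN])
{
    assert(out != NULL);
    static const char written[] = "DejaVu Sans Mono\x01\x02";
    memcpy(out, written, sizeof written);   /* copy includes the final NUL */
}

/* Zero the buffer first, then call the stand-in.  Returns the length of the
   resulting string so the caller can see whether stray bytes survived. */
size_t face_len_after_zero_init(void)
{
    char face[FACE_LEN] = { 0 };
    fake_get_face_name(face);
    return strlen(face);
}
```

Under that assumption the zeroed buffer still ends up holding 18 characters
rather than the 16 of "DejaVu Sans Mono": pre-zeroing cannot remove bytes
that are written before the terminator by the callee.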

> and remove loop, which opens an attack vector by renaming a good font
> and substituting one with a shorter name, or could cause problems by
> using the wrong font e.g DejaVu Sans.

How so?

> You need to self-impose a change freeze before heading out, once you're in
> "stuff to get done before leaving" mode, which may be a day or up to a week, you
> delegate or postpone decisions and actions until you return: BTDTGTS (Got The
> Scars) ;^>

I hear you.


Corinna

-- 
Corinna Vinschen                  Please, send mails regarding Cygwin to
Cygwin Maintainer                 cygwin AT cygwin DOT com
Red Hat


^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: Cygwin fails to utilize Unicode replacement character
  2018-09-07 17:01                                                               ` Marco Atzeri
@ 2018-09-07 18:21                                                                 ` Corinna Vinschen
  0 siblings, 0 replies; 62+ messages in thread
From: Corinna Vinschen @ 2018-09-07 18:21 UTC (permalink / raw)
  To: cygwin


On Sep  7 19:01, Marco Atzeri wrote:
> On 07.09.2018 at 18:48, Brian Inglis wrote:
> > On 2018-09-07 05:54, Corinna Vinschen wrote:
> > > On Sep  7 13:51, Thomas Wolff wrote:
> > > > On 07.09.2018 13:41, Thomas Wolff wrote:
> > > > > On 07.09.2018 13:29, Corinna Vinschen wrote:
> > > > > > On Sep  7 12:34, Thomas Wolff wrote:
> > > > > > > On 07.09.2018 10:17, Corinna Vinschen wrote:
> > > > > > > > On Sep  6 09:01, Corinna Vinschen wrote:
> > 
> > Garbage in font name from uninit struct on stack?
> > Before call bzero/memset/implicit:
> > 
> > -  CONSOLE_FONT_INFOEX cfi;
> > +  CONSOLE_FONT_INFOEX cfi = { 0 };
> > 
> 
> This change is effective on the test program on W10

Yes, but it doesn't actually help.  Change something, *anything* in the
test application.  Or build it with optimization vs. without.  The
stray chars will return.


Corinna

-- 
Corinna Vinschen                  Please, send mails regarding Cygwin to
Cygwin Maintainer                 cygwin AT cygwin DOT com
Red Hat


^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: Cygwin fails to utilize Unicode replacement character
  2018-09-03 22:15                             ` Steven Penny
  2018-09-04  6:06                               ` Brian Inglis
  2018-09-04  9:00                               ` Corinna Vinschen
@ 2018-10-04  0:25                               ` Steven Penny
  2 siblings, 0 replies; 62+ messages in thread
From: Steven Penny @ 2018-10-04  0:25 UTC (permalink / raw)
  To: cygwin

On Mon, 03 Sep 2018 15:15:26, Steven Penny wrote:
> Expanding on the "Notepad" example, the "Notepad" default font is "Lucida
> Console", which doesn't have U+FFFD either. However, pasting into "Notepad" will
> still show U+FFFD properly because "Tahoma" has U+FFFD and "Notepad" can
> utilize a composite font, while it appears "cmd.exe" and similar cannot.

http://cygwin.com/ml/cygwin/2018-09/msg00060.html

I should correct myself. "cmd.exe" can do this; it's called Font Linking:

http://docs.microsoft.com/globalization/input/font-technology

It allows you to pick a "base font", for example "Consolas". Then you can link
another font that has U+FFFD, like "Tahoma". Ideally both fonts would be
monospace, but it seems Windows has no built-in monospace font with U+FFFD.

Then when a missing character is encountered, it will pull from the linked font
if possible.
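
The fallback order described here can be modelled in a few lines of portable
C. This is only a model, not how Windows implements font linking:
`struct font`, `pick_font`, and the glyph tables are hypothetical, and the
coverage lists are made up for illustration, apart from the point that the
base font lacks U+FFFD and the linked font has it.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical font description: a name plus the code points it covers. */
struct font {
    const char *name;
    const unsigned *glyphs;
    size_t nglyphs;
};

/* Illustrative coverage only; not the real glyph sets of these fonts. */
static const unsigned base_glyphs[]   = { 0x0041, 0x25A1 };
static const unsigned linked_glyphs[] = { 0x0041, 0xFFFD };

const struct font base_font   = { "Consolas", base_glyphs, 2 };
const struct font linked_font = { "Tahoma", linked_glyphs, 2 };

/* Return 1 if font `f` covers code point `cp`, 0 otherwise. */
static int has_glyph(const struct font *f, unsigned cp)
{
    for (size_t i = 0; i < f->nglyphs; i++)
        if (f->glyphs[i] == cp)
            return 1;
    return 0;
}

/* Return the name of the font that supplies `cp`: the base font if it has
   the glyph, otherwise the first linked font that does, or NULL if none. */
const char *pick_font(const struct font *base, const struct font *linked,
                      size_t nlinked, unsigned cp)
{
    assert(base != NULL);
    if (has_glyph(base, cp))
        return base->name;
    for (size_t i = 0; i < nlinked; i++)
        if (has_glyph(&linked[i], cp))
            return linked[i].name;
    return NULL;
}
```

In this model `pick_font(&base_font, &linked_font, 1, 0xFFFD)` yields the
linked font's name, while a code point that neither font covers yields NULL,
which corresponds to the case where no replacement glyph can be drawn.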



^ permalink raw reply	[flat|nested] 62+ messages in thread

end of thread, other threads:[~2018-10-04  0:25 UTC | newest]

Thread overview: 62+ messages
2018-09-01 16:13 Cygwin fails to utilize Unicode replacement character Steven Penny
2018-09-01 18:11 ` Thomas Wolff
2018-09-01 18:46   ` Steven Penny
2018-09-01 21:07     ` Thomas Wolff
2018-09-01 19:40 ` Corinna Vinschen
2018-09-01 21:50 ` Doug Henderson
2018-09-01 22:49   ` Steven Penny
2018-09-02  8:07     ` Thomas Wolff
2018-09-02 12:51       ` Steven Penny
2018-09-03 12:46         ` Corinna Vinschen
2018-09-03 14:59           ` Corinna Vinschen
2018-09-03 16:34             ` Thomas Wolff
2018-09-03 17:17               ` Corinna Vinschen
2018-09-03 17:56                 ` Thomas Wolff
2018-09-03 18:20                   ` Thomas Wolff
2018-09-03 19:14                     ` Corinna Vinschen
2018-09-03 20:27                       ` Corinna Vinschen
2018-09-03 20:42                         ` Thomas Wolff
2018-09-03 21:03                           ` Corinna Vinschen
2018-09-03 22:15                             ` Steven Penny
2018-09-04  6:06                               ` Brian Inglis
2018-09-04  9:00                               ` Corinna Vinschen
2018-09-04 11:40                                 ` Steven Penny
2018-09-05  7:55                                   ` Corinna Vinschen
2018-09-05  9:22                                     ` Thomas Wolff
2018-09-05 11:58                                     ` Steven Penny
2018-09-05 13:18                                       ` Marco Atzeri
2018-09-05 15:20                                         ` Andrey Repin
2018-09-05 15:58                                         ` Corinna Vinschen
2018-09-05 20:15                                           ` Corinna Vinschen
2018-09-06  1:35                                             ` Steven Penny
2018-09-06  7:01                                               ` Corinna Vinschen
2018-09-07  8:20                                                 ` Corinna Vinschen
2018-09-07 10:34                                                   ` Thomas Wolff
2018-09-07 11:29                                                     ` Corinna Vinschen
2018-09-07 11:42                                                       ` Thomas Wolff
2018-09-07 11:51                                                         ` Thomas Wolff
2018-09-07 11:54                                                           ` Corinna Vinschen
2018-09-07 16:22                                                             ` Brian Inglis
2018-09-07 16:48                                                             ` Brian Inglis
2018-09-07 17:01                                                               ` Marco Atzeri
2018-09-07 18:21                                                                 ` Corinna Vinschen
2018-09-07 18:20                                                               ` Corinna Vinschen
2018-09-05 13:35                                       ` Andrey Repin
2018-09-05 14:04                                         ` Houder
2018-09-05 15:05                                           ` Andrey Repin
2018-09-04 12:50                                 ` David Macek
2018-09-04 14:18                                   ` Thomas Wolff
2018-09-04 14:46                                     ` David Macek
2018-09-04 18:20                                     ` Steven Penny
2018-09-04 18:41                                       ` Thomas Wolff
2018-09-04 19:50                                         ` Andrey Repin
2018-09-04 19:53                                         ` Steven Penny
2018-09-04 21:43                                           ` Thomas Wolff
2018-09-04 23:29                                             ` Steven Penny
2018-09-04 20:40                                       ` Brian Inglis
2018-09-05  8:32                                         ` Corinna Vinschen
2018-09-04 13:05                                 ` Andrey Repin
2018-10-04  0:25                               ` Steven Penny
2018-09-03 16:05         ` Brian Inglis
2018-09-04 19:59 ` Doug Henderson
2018-09-04 21:05   ` Steven Penny
