public inbox for binutils@sourceware.org
* Windres with .rc file that references utf8 code page manifest doesn't seem to work
@ 2023-03-01 13:36 Costas Argyris
  2023-04-28 13:15 ` Jan Beulich
From: Costas Argyris @ 2023-03-01 13:36 UTC (permalink / raw)
  To: binutils

Hi

I am trying to embed a .manifest file as a resource into a Windows
executable with windres, but it doesn't seem to work. Here are the steps
I am following:

1) Resource .rc file:

C:\Users\cargyris\temp>cat utf8.rc
1 24 "utf8.manifest"

Resource ID 1 with resource type 24 (RT_MANIFEST) is the correct pair for
embedding an application manifest into an executable.

2) It references the .manifest file:

C:\Users\cargyris\temp>cat utf8.manifest
<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<assembly manifestVersion="1.0" xmlns="urn:schemas-microsoft-com:asm.v1">
  <application>
    <windowsSettings>
      <activeCodePage xmlns="http://schemas.microsoft.com/SMI/2019/WindowsSettings">UTF-8</activeCodePage>
    </windowsSettings>
  </application>
</assembly>

This was taken from the official Microsoft docs:

https://learn.microsoft.com/en-us/windows/apps/design/globalizing/use-utf8-code-page

3) Compile the .rc file with windres into a COFF object file:

C:\Users\cargyris\temp>windres utf8.rc utf8.o

4) Compile a dummy .c file with gcc (mingw-w64):

C:\Users\cargyris\temp>gcc -g -O0 -c src.c

This creates src.o in the temp directory.

5) Link the two object files together:

C:\Users\cargyris\temp>gcc src.o utf8.o

This creates the executable a.exe.



The problem is that this executable should now run with UTF-8 as its
active code page, since the appropriate manifest is embedded in it, but it
does not seem to. To demonstrate, pass it an argument containing a
non-ASCII Unicode character and use gdb to inspect how main receives it:

C:\Users\cargyris\temp>gdb --args a.exe ﹏\src.c
...
(gdb) break main
Breakpoint 1 at 0x401564: file src.c, line 5.
(gdb) run
Starting program: C:\Users\cargyris\temp\a.exe "?\src.c"
[New Thread 3692.0x1e44]
[New Thread 3692.0x20bc]

Thread 1 hit Breakpoint 1, main (argc=2, argv=0x741800) at src.c:5
5           printf("sks\n");
(gdb) x/4xb argv[1]
0x741770:       0x3f    0x5c    0x73    0x72
(gdb) print argv[1]
$1 = 0x741770 "?\\src.c"
(gdb)


The Unicode character never made it to main. It was this character:

https://www.compart.com/en/unicode/U+FE4F

which has a 3-byte long UTF-8 representation:

0xEF 0xB9 0x8F

so I expected to see those 3 bytes in argv[1] in the main function.
Instead, main received a question mark followed by the backslash.
Switching the Command Prompt code page to UTF-8 with chcp 65001 just
before starting gdb does not help; the behavior is the same.



The .manifest file I used was taken from the official docs, and it does
work when embedded with mt.exe as described there: the program then
receives its argument as a UTF-8 char array.

I was under the impression that embedding the manifest with windres should
have the same effect as embedding it with the Microsoft tools, that is, the
final executable should run with the UTF-8 code page. This doesn't seem to
be happening, though. Is this a problem with windres, with the link phase,
or with something I am doing wrong, or is it simply not possible because of
Microsoft-specific details that the GNU tools do not capture?

Thanks,
Costas

* Re: Windres with .rc file that references utf8 code page manifest doesn't seem to work
  2023-03-01 13:36 Windres with .rc file that references utf8 code page manifest doesn't seem to work Costas Argyris
@ 2023-04-28 13:15 ` Jan Beulich
  2023-04-28 13:29   ` Costas Argyris
From: Jan Beulich @ 2023-04-28 13:15 UTC (permalink / raw)
  To: Costas Argyris; +Cc: binutils

On 01.03.2023 14:36, Costas Argyris via Binutils wrote:
> [...]
> 
> The .manifest file I used was taken from the official docs and actually
> works when embedding it with mt.exe as described there - the program
> properly takes a UTF-8 char array.

Have you inspected the resulting binaries (or dumps of them from whatever
inspection tool is suitable)? There ought to be some difference, and
knowing what is different is perhaps more helpful here than observations
from running the program under gdb. It may also be worthwhile to check
whether mt.exe can actually extract the manifest from the
windres-produced .exe.

> I was under the impression that embedding the manifest with windres should
> have the same effect as embedding it with ms tools,

Hmm, here you say "embedding", whereas above you said you link the COFF
object. These are entirely different steps, so what about using windres on
the .exe instead (which as per the doc ought to be possible)?

Jan

> that is, the final
> executable should be working fully in UTF-8.    This doesn't seem to be
> happening though.    Is that a problem with windres, the link phase,
> something I am doing wrong, or is it just not possible due to ms-specifics
> that are not captured by GNU tools?
> 
> Thanks,
> Costas


* Re: Windres with .rc file that references utf8 code page manifest doesn't seem to work
  2023-04-28 13:15 ` Jan Beulich
@ 2023-04-28 13:29   ` Costas Argyris
From: Costas Argyris @ 2023-04-28 13:29 UTC (permalink / raw)
  To: Jan Beulich; +Cc: binutils

Please disregard my previous email.

Windres is working just fine. Running the program under gdb to test the
result wasn't a very good idea.

I just forgot to send a follow-up email to say there is no real problem.
Sorry about that.

Costas
