public inbox for binutils@sourceware.org
From: Jan Beulich <jbeulich@suse.com>
To: Costas Argyris <costas.argyris@gmail.com>
Cc: binutils@sourceware.org
Subject: Re: Windres with .rc file that references utf8 code page manifest doesn't seem to work
Date: Fri, 28 Apr 2023 15:15:48 +0200
Message-ID: <a32297e1-8ac8-9092-e497-a8b97c5fd6e1@suse.com>
In-Reply-To: <CAHyHGCmbYATbVYzgn3-00r1B0xhh2yLkntOgDs=EMq_mzA0XFQ@mail.gmail.com>

On 01.03.2023 14:36, Costas Argyris via Binutils wrote:
> Hi
> 
> I am trying to embed a .manifest file as a resource into a Windows
> executable with windres, but it doesn't seem to work. Here are the steps
> I am following:
> 
> 1) Resource .rc file:
> 
> C:\Users\cargyris\temp>cat utf8.rc
> 1 24 "utf8.manifest"
> 
> These are the correct defines for embedding a manifest into an executable.
> 
> 2) It references the .manifest file:
> 
> C:\Users\cargyris\temp>cat utf8.manifest
> <?xml version="1.0" encoding="UTF-8" standalone="yes"?>
> <assembly manifestVersion="1.0" xmlns="urn:schemas-microsoft-com:asm.v1">
>   <application>
>     <windowsSettings>
>       <activeCodePage xmlns="http://schemas.microsoft.com/SMI/2019/WindowsSettings">UTF-8</activeCodePage>
>     </windowsSettings>
>   </application>
> </assembly>
> 
> This was taken from the official ms docs:
> 
> https://learn.microsoft.com/en-us/windows/apps/design/globalizing/use-utf8-code-page
> 
> 3) Compile the .rc file with windres into an object COFF file:
> 
> C:\Users\cargyris\temp>windres utf8.rc utf8.o
> 
> 4) Compile a dummy .c file with gcc (mingw-w64):
> 
> C:\Users\cargyris\temp>gcc -g -O0 -c src.c
> 
> This creates src.o in the temp dir
> 
> 5) Link the two object files together:
> 
> C:\Users\cargyris\temp>gcc src.o utf8.o
> 
> This creates the executable a.exe
> 
> 
> 
> Now the problem is that this executable should be working fully in UTF-8
> since we embedded the appropriate manifest into it, but it's not. To
> prove this, pass to it an argument that has a Unicode character and inspect
> how that argument gets picked up by the main function using gdb:
> 
> C:\Users\cargyris\temp>gdb --args a.exe ﹏\src.c
> ...
> (gdb) break main
> Breakpoint 1 at 0x401564: file src.c, line 5.
> (gdb) run
> Starting program: C:\Users\cargyris\temp\a.exe "?\src.c"
> [New Thread 3692.0x1e44]
> [New Thread 3692.0x20bc]
> 
> Thread 1 hit Breakpoint 1, main (argc=2, argv=0x741800) at src.c:5
> 5           printf("sks\n");
> (gdb) x/4xb argv[1]
> 0x741770:       0x3f    0x5c    0x73    0x72
> (gdb) print argv[1]
> $1 = 0x741770 "?\\src.c"
> (gdb)
> 
> 
> The Unicode character never made it to main. It was this character:
> 
> https://www.compart.com/en/unicode/U+FE4F
> 
> which has a 3-byte long UTF-8 representation:
> 
> 0xEF 0xB9 0x8F
> 
> so I was expecting to see those 3 bytes in argv[1] in the main function.
> Instead, main got the question mark and backslash characters. It doesn't
> help to switch the Command Prompt code page to UTF-8 (65001) with chcp
> 65001 just before starting gdb, still the same behavior occurs.
> 
> 
> 
> The .manifest file I used was taken from the official docs and actually
> works when embedding it with mt.exe as described there - the program
> properly takes a UTF-8 char array.

Have you inspected the resulting binaries (or dumps of them from whatever
inspection tool is suitable)? There ought to be some difference, and
knowing what differs is likely more helpful here than observations from
running the program under gdb. It may also be worthwhile to check whether
mt.exe can actually extract the manifest from the windres-produced .exe.

> I was under the impression that embedding the manifest with windres should
> have the same effect as embedding it with ms tools,

Hmm, here you say "embedding", whereas above you said you link the COFF
object. These are entirely different steps, so what about using windres on
the .exe instead (which, per the documentation, ought to be possible)?

Jan

> that is, the final
> executable should be working fully in UTF-8. This doesn't seem to be
> happening though. Is that a problem with windres, the link phase,
> something I am doing wrong, or is it just not possible due to ms-specifics
> that are not captured by GNU tools?
> 
> Thanks,
> Costas

