From: Costas Argyris <costas.argyris@gmail.com>
To: Jacek Caban <jacek@codeweavers.com>
Cc: gcc-patches@gcc.gnu.org
Subject: Re: Enable UTF-8 code page in driver and compiler on 64-bit mingw host [PR108865]
Date: Wed, 8 Mar 2023 10:52:05 +0000
Message-ID: <CAHyHGCm2CHwjdNx1H5SOgm2iYB1ah-wLi0UkDGKOJLep+x+2dg@mail.gmail.com>
In-Reply-To: <CAHyHGC=dzErh_2Xnn-t2MsCqUUF5K5XT-zUXaRCD_mOy3v_02Q@mail.gmail.com>


[-- Attachment #1.1: Type: text/plain, Size: 3368 bytes --]

I added the .manifest file as a prerequisite in the make rule for
utf8rc-mingw32.o. The latest patch is attached.

On Tue, 7 Mar 2023 at 15:27, Costas Argyris <costas.argyris@gmail.com>
wrote:

> Hi Jacek,
>
> "but I think it should work just fine if you didn't explicitly limit the
> patch to x86_64."
>
> I would think so too.
>
> Actually, even Cygwin might benefit from this, assuming it has the same
> problem, though I don't know whether it does.
>
> But I'm not experienced with that so I would like to explore these hosts
> separately and just focus on the most common 64-bit Windows host with this
> change, if possible.
>
> "The point is that when winnt-utf8.manifest is modified, utf8-mingw32.o
> should be rebuilt."
>
> Right, makes sense.
>
> Just noting that winnt-utf8.manifest is really not meant to be modified,
> because it is copied straight from:
>
>
> https://learn.microsoft.com/en-us/windows/apps/design/globalizing/use-utf8-code-page
>
> and will probably remain like that, but I do get your point and I am happy
> to make the change.
>
> Thanks,
> Costas
>
> On Tue, 7 Mar 2023 at 14:18, Jacek Caban <jacek@codeweavers.com> wrote:
>
>> Hi Costas,
>>
>> On 3/7/23 15:00, Costas Argyris wrote:
>> > Hi Jacek,
>> >
>> > "Is there a reason to make it specific to x86_64? It seems to me that
>> > all mingw hosts could use it."
>> >
>> > Are you referring to the 32-bit host? My concern here is that this
>> > functionality (embedding the UTF-8
>> > manifest file into the executable) is only truly supported in recent
>> > versions of Windows. From:
>> >
>> >
>> > https://learn.microsoft.com/en-us/windows/apps/design/globalizing/use-utf8-code-page
>> >
>> > It says that Windows Version 1903 (May 2019 Update) enables this, so
>> > we are looking at the 64-bit
>> > version of Windows.
>> >
>> > I suppose you are referring to the scenario where one has a 32-bit
>> > gcc + mingw running in a 64-bit
>> > Windows that is recent enough to support this? It is not clear to
>> > me based on the above doc what
>> > would happen encoding-wise in that situation, and I haven't tried it
>> > either because I assumed that
>> > most people would want the 64-bit version of gcc since they are
>> > probably running a 64-bit OS.
>> >
>> > If you think it is useful, I could look into that as a separate task
>> > to try and keep this one simple, if
>> > that makes sense.
>>
>>
>> Yes, realistically it's mostly about 32-bit gcc on 64-bit Windows
>> (perhaps aarch64 as well at some point in the future). It's probably
>> indeed not a very popular configuration these days, but I think it should
>> work just fine if you didn't explicitly limit the patch to x86_64.
>>
>>
>> > "I think that .manifest file should also be a dependency here."
>> >
>> > Why is that? Windres takes only the .rc file as its input, as per
>> > its own doc, and it successfully
>> > compiles it into an object file. The .manifest file is only
>> > referenced by the .rc file, and it doesn't
>> > get passed to windres, so I don't see why it has to be listed as a
>> > prerequisite in the make rule.
>>
>>
>> The point is that when winnt-utf8.manifest is modified, utf8-mingw32.o
>> should be rebuilt. Anyway, it's probably not a big deal (I should
>> disclaim that I'm not very familiar with gcc build system; I'm mostly on
>> this ML due to mingw-w64 contributions).
>>
>>
>> Thanks,
>>
>> Jacek
>>
>>

[-- Attachment #2: 0001-Enable-UTF-8-code-page-on-Windows-64-bit-host-PR1088.patch --]
[-- Type: text/x-patch, Size: 6518 bytes --]

From 694d6f4860a08f690070df411f3f72d66a48a981 Mon Sep 17 00:00:00 2001
From: Costas Argyris <costas.argyris@gmail.com>
Date: Tue, 28 Feb 2023 17:10:18 +0000
Subject: [PATCH] Enable UTF-8 code page on Windows 64-bit host [PR108865]

Compile a resource object that contains the utf8 manifest.

Then link that object into the driver and compiler proper.

For compiler proper the link has to be forced because the
resource object file gets into a static library (libbackend.a)
and gets eventually dropped because it has no symbols of
its own and nothing is referencing it inside the library.

Therefore, an artificial symbol is planted to force the link.
---
 gcc/config.host                     |  5 ++-
 gcc/config/i386/sym-mingw32.cc      |  1 +
 gcc/config/i386/utf8-mingw32.rc     |  3 ++
 gcc/config/i386/winnt-utf8.manifest |  8 ++++
 gcc/config/i386/x-mingw32           |  3 +-
 gcc/config/i386/x-mingw32-utf8      | 57 +++++++++++++++++++++++++++++
 6 files changed, 73 insertions(+), 4 deletions(-)
 create mode 100644 gcc/config/i386/sym-mingw32.cc
 create mode 100644 gcc/config/i386/utf8-mingw32.rc
 create mode 100644 gcc/config/i386/winnt-utf8.manifest
 create mode 100644 gcc/config/i386/x-mingw32-utf8

diff --git a/gcc/config.host b/gcc/config.host
index a522c39658e..4abb32ad73d 100644
--- a/gcc/config.host
+++ b/gcc/config.host
@@ -241,10 +241,11 @@ case ${host} in
   x86_64-*-mingw*)
     use_long_long_for_widest_fast_int=yes
     host_xm_file=i386/xm-mingw32.h
-    host_xmake_file="${host_xmake_file} i386/x-mingw32"
+    host_xmake_file="${host_xmake_file} i386/x-mingw32 i386/x-mingw32-utf8"
     host_exeext=.exe
     out_host_hook_obj=host-mingw32.o
-    host_extra_gcc_objs="${host_extra_gcc_objs} driver-mingw32.o"
+    host_extra_objs="${host_extra_objs} utf8-mingw32.o"
+    host_extra_gcc_objs="${host_extra_gcc_objs} driver-mingw32.o utf8rc-mingw32.o"
     host_lto_plugin_soname=liblto_plugin.dll
     ;;
   aarch64*-*-darwin*)
diff --git a/gcc/config/i386/sym-mingw32.cc b/gcc/config/i386/sym-mingw32.cc
new file mode 100644
index 00000000000..f369698abc4
--- /dev/null
+++ b/gcc/config/i386/sym-mingw32.cc
@@ -0,0 +1 @@
+char HOST_EXTRA_OBJS_SYMBOL;
diff --git a/gcc/config/i386/utf8-mingw32.rc b/gcc/config/i386/utf8-mingw32.rc
new file mode 100644
index 00000000000..e2174e85b7c
--- /dev/null
+++ b/gcc/config/i386/utf8-mingw32.rc
@@ -0,0 +1,3 @@
+#include <winuser.h>
+
+CREATEPROCESS_MANIFEST_RESOURCE_ID RT_MANIFEST "winnt-utf8.manifest"
diff --git a/gcc/config/i386/winnt-utf8.manifest b/gcc/config/i386/winnt-utf8.manifest
new file mode 100644
index 00000000000..dab929e1515
--- /dev/null
+++ b/gcc/config/i386/winnt-utf8.manifest
@@ -0,0 +1,8 @@
+<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
+<assembly manifestVersion="1.0" xmlns="urn:schemas-microsoft-com:asm.v1">
+  <application>
+    <windowsSettings>
+      <activeCodePage xmlns="http://schemas.microsoft.com/SMI/2019/WindowsSettings">UTF-8</activeCodePage>
+    </windowsSettings>
+  </application>
+</assembly>
diff --git a/gcc/config/i386/x-mingw32 b/gcc/config/i386/x-mingw32
index 5b8b5f96143..cb3d8434881 100644
--- a/gcc/config/i386/x-mingw32
+++ b/gcc/config/i386/x-mingw32
@@ -27,8 +27,7 @@ WERROR_FLAGS += -Wno-format
 
 host-mingw32.o : $(srcdir)/config/i386/host-mingw32.cc $(CONFIG_H) $(SYSTEM_H) \
   coretypes.h hosthooks.h hosthooks-def.h toplev.h $(DIAGNOSTIC_H) $(HOOKS_H)
-	$(COMPILER) -c $(ALL_COMPILERFLAGS) $(ALL_CPPFLAGS) $(INCLUDES) \
-		$(srcdir)/config/i386/host-mingw32.cc
+	$(COMPILER) -c $(ALL_COMPILERFLAGS) $(ALL_CPPFLAGS) $(INCLUDES) $<
 
 driver-mingw32.o : $(srcdir)/config/i386/driver-mingw32.cc $(CONFIG_H)
 	$(COMPILER) -c $(ALL_COMPILERFLAGS) $(ALL_CPPFLAGS) $(INCLUDES) $<
diff --git a/gcc/config/i386/x-mingw32-utf8 b/gcc/config/i386/x-mingw32-utf8
new file mode 100644
index 00000000000..efeeeff4996
--- /dev/null
+++ b/gcc/config/i386/x-mingw32-utf8
@@ -0,0 +1,57 @@
+# Copyright (C) 2023 Free Software Foundation, Inc.
+#
+# This file is part of GCC.
+#
+# GCC is free software; you can redistribute it and/or modify
+# it under the terms of the GNU General Public License as published by
+# the Free Software Foundation; either version 3, or (at your option)
+# any later version.
+#
+# GCC is distributed in the hope that it will be useful,
+# but WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+# GNU General Public License for more details.
+#
+# You should have received a copy of the GNU General Public License
+# along with GCC; see the file COPYING3.  If not see
+# <http://www.gnu.org/licenses/>.
+#
+#
+# For 64-bit Windows host, embed a manifest that sets the active
+# code page of the driver and compiler proper processes to utf8.
+# This only has an effect on Windows version 1903 (May 2019 Update)
+# or later.
+
+# The resource .rc file references the utf8 .manifest file.
+# Compile it into an object file using windres.
+# The resulting .o file gets added to host_extra_gcc_objs in
+# config.host for x86_64-*-mingw* host and gets linked into
+# the driver as a .o file, so its lack of symbols is OK.
+utf8rc-mingw32.o : $(srcdir)/config/i386/utf8-mingw32.rc \
+  $(srcdir)/config/i386/winnt-utf8.manifest
+	$(WINDRES) $< $@
+
+# Create an object file that just exports the global symbol
+# HOST_EXTRA_OBJS_SYMBOL
+sym-mingw32.o : $(srcdir)/config/i386/sym-mingw32.cc
+	$(COMPILER) -c $< -o $@
+
+# Combine the two object files into one which has both the
+# compiled utf8 resource and the HOST_EXTRA_OBJS_SYMBOL symbol.
+# The resulting .o file gets added to host_extra_objs in
+# config.host for x86_64-*-mingw* host and gets archived into
+# libbackend.a which gets linked into the compiler proper.
+# If nothing references it in libbackend.a, it will not
+# get linked into the compiler proper eventually.
+# Therefore we need to request the symbol at compiler link time.
+utf8-mingw32.o : utf8rc-mingw32.o sym-mingw32.o
+	$(COMPILER) -r utf8rc-mingw32.o sym-mingw32.o -o $@
+
+# Force compilers to link against the utf8 resource by
+# requiring the symbol to be defined.
+# Otherwise the object file won't get linked in the compilers
+# because nothing is referencing it in libbackend.a
+# This is expected because the resource object is not supposed
+# to have any symbols, it just has to be linked into the
+# executable in order for Windows to use the utf8 code page.
+$(COMPILERS) : override LDFLAGS += -Wl,--require-defined=HOST_EXTRA_OBJS_SYMBOL
-- 
2.30.2

