From: "costas.argyris at gmail dot com" <gcc-bugzilla@gcc.gnu.org>
To: gcc-bugs@gcc.gnu.org
Subject: [Bug driver/108865] New: gcc on Windows fails with Unicode path to source file
Date: Mon, 20 Feb 2023 21:11:20 +0000
Message-ID: <bug-108865-4@http.gcc.gnu.org/bugzilla/>

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108865

            Bug ID: 108865
           Summary: gcc on Windows fails with Unicode path to source file
           Product: gcc
           Version: unknown
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: driver
          Assignee: unassigned at gcc dot gnu.org
          Reporter: costas.argyris at gmail dot com
  Target Milestone: ---

In the Windows Command Prompt, the directory temp has a subfolder named ﹏
(https://www.compart.com/en/unicode/U+FE4F) that contains a source file.
Trying to compile that file fails:

C:\Users\cargyris\temp>gcc ﹏\src.c
gcc: error: ?\src.c: Invalid argument
gcc: fatal error: no input files
compilation terminated.

Note that ﹏ has been replaced with a question mark (?).

The problem starts as early as gcc's main function in gcc-main.cc.

The main function is the normal one which takes char *argv[], that is, it takes
its command-line arguments as char-based strings. On Windows, this means that
the arguments are interpreted using the local Windows ANSI code page, and, as
a result, the ﹏ character is lost right from the start: gcc never sees it
correctly.
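
The ? in the error output is what a lossy wide-to-narrow conversion would
produce. Below is a minimal sketch in C of that effect. It assumes a simplified
single-byte code page in which every code unit above 0xFF has no mapping and is
replaced by ?. The function name narrow_lossy and the Latin-1-like mapping are
hypothetical stand-ins for illustration, not the actual Windows conversion
tables, and uint16_t stands in for a UTF-16 code unit.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Simplified model of a lossy wide-to-narrow conversion.
   Assumption for illustration only: code units up to 0xFF map to one byte,
   everything else has no representation and becomes '?'.
   The output is always NUL-terminated and truncated to fit out_size. */
void narrow_lossy(const uint16_t *wide, char *out, size_t out_size)
{
    size_t i = 0;

    assert(wide != NULL && out != NULL && out_size > 0);

    while (wide[i] != 0 && i + 1 < out_size) {
        out[i] = (wide[i] <= 0xFF) ? (char)wide[i] : '?';
        i++;
    }
    out[i] = '\0';
}
```

Feeding it the UTF-16 form of the path from the report (U+FE4F followed by
\src.c) yields ?\src.c, which matches the name gcc printed in its error.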

The way to see the Unicode arguments properly would be to replace main with
wmain, which takes wchar_t *argv[] and receives the arguments as UTF-16.
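
A wmain-based entry point would receive UTF-16 arguments, and one plausible
way to keep the rest of a char-based program unchanged is to convert each
argument to UTF-8 first. The following is a minimal, portable sketch of such
a conversion in C. The function name utf16_to_utf8 is hypothetical, uint16_t
stands in for the 16-bit wchar_t used on Windows, and nothing here is taken
from gcc's sources.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Encode a NUL-terminated UTF-16 string as NUL-terminated UTF-8.
   Returns the number of bytes written (excluding the terminator), or
   (size_t)-1 if the buffer is too small or a surrogate is unpaired. */
size_t utf16_to_utf8(const uint16_t *in, char *out, size_t out_size)
{
    size_t n = 0;

    assert(in != NULL && out != NULL && out_size > 0);

    while (*in != 0) {
        uint32_t cp = *in++;
        size_t len;

        if (cp >= 0xD800 && cp <= 0xDBFF) {
            /* High surrogate: must be followed by a low surrogate. */
            uint32_t lo = *in;
            if (lo < 0xDC00 || lo > 0xDFFF)
                return (size_t)-1;
            in++;
            cp = 0x10000 + ((cp - 0xD800) << 10) + (lo - 0xDC00);
        } else if (cp >= 0xDC00 && cp <= 0xDFFF) {
            return (size_t)-1; /* stray low surrogate */
        }

        len = cp < 0x80 ? 1 : cp < 0x800 ? 2 : cp < 0x10000 ? 3 : 4;
        if (n + len >= out_size) /* keep room for the terminator */
            return (size_t)-1;

        switch (len) {
        case 1:
            out[n++] = (char)cp;
            break;
        case 2:
            out[n++] = (char)(0xC0 | (cp >> 6));
            out[n++] = (char)(0x80 | (cp & 0x3F));
            break;
        case 3:
            out[n++] = (char)(0xE0 | (cp >> 12));
            out[n++] = (char)(0x80 | ((cp >> 6) & 0x3F));
            out[n++] = (char)(0x80 | (cp & 0x3F));
            break;
        default:
            out[n++] = (char)(0xF0 | (cp >> 18));
            out[n++] = (char)(0x80 | ((cp >> 12) & 0x3F));
            out[n++] = (char)(0x80 | ((cp >> 6) & 0x3F));
            out[n++] = (char)(0x80 | (cp & 0x3F));
            break;
        }
    }
    out[n] = '\0';
    return n;
}
```

For the path in this report, U+FE4F becomes the three bytes EF B9 8F, so the
character survives the conversion instead of collapsing to ?. File APIs that
take char strings would then still need to interpret those bytes as UTF-8,
which this sketch does not address.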

Would changing main to wmain when compiling for Windows + mingw-w64 ever be
considered, in order to support Unicode paths on Windows?

There is also another solution outside of gcc: changing the ANSI code page to
UTF-8. This can be done either at a global system level (for the whole Windows
OS) or at a per-process level, specifically targeting gcc to use UTF-8 as its
ANSI code page. These approaches require user intervention though, whereas if
the Unicode entry point (wmain) were used, Unicode paths would just work
without the user having to do anything special.
