From: "costas.argyris at gmail dot com"
To: gcc-bugs@gcc.gnu.org
Subject: [Bug driver/108865] gcc on Windows fails with Unicode path to source file
Date: Tue, 28 Feb 2023 22:07:44 +0000
X-Bugzilla-Product: gcc
X-Bugzilla-Component: driver
X-Bugzilla-Status: NEW
X-Bugzilla-Severity: normal
X-Bugzilla-Priority: P3

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108865

--- Comment #10 from Costas Argyris ---
The only interesting bit I found there was the shell script that gets called before actually running windres:

https://github.com/jbruchon/jdupes/blob/master/Makefile#L201

which is doing some setup:

https://github.com/jbruchon/jdupes/blob/master/tune_winres.sh

It is changing some version information, but I don't think this is relevant to the problem I am having.

Actually, to be more precise, what I did was not a complete failure, which makes this even stranger: I have a custom tool I wrote a while ago that takes the path to a Windows executable and reports the active code page it is using, and this tool actually reports the correct UTF-8 code page when I use the patch I posted. So it looks like it worked at first, but the arguments passed to the executable are still destroyed before main has a chance to do anything with them.

It is as if the executable itself is successfully converted to use UTF-8, but the setup done by the OS before reaching the entry point (main) hasn't been done properly, so the args never reach main intact. I suspect this is the part that the ms tools do that we don't. It makes some sense because, for this particular problem, the arguments passed to the program matter as well, not only the program itself.

Perhaps the ms tools do some more work on the executable (besides just linking in the manifest) that signals to the OS loader that the args passed to it must also be interpreted as UTF-8. If such a thing is happening, I think our linking of the resource object file would never accomplish that.

On another note, that program doesn't need to use the UTF-8 manifest because apparently it uses the wmain approach to get UTF-16 wide strings and converts them to char-based UTF-8, which wasn't a very good solution for gcc due to the impact on the rest of the programs it spawns:

https://github.com/jbruchon/jdupes/blob/master/jody_win_unicode.c
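
For anyone who wants to reproduce the "args are destroyed before main" observation, here is a minimal sketch of the kind of check I mean. It is not the custom tool mentioned above, and the names is_valid_utf8 and report_args are made up for illustration. It hex-dumps each argv entry and says whether the bytes are well-formed UTF-8; on Windows it also prints GetACP(), which should be 65001 (CP_UTF8) if the manifest took effect. A lossy conversion typically shows up either as literal '?' bytes (3F) in place of the original characters or as bytes that are not valid UTF-8.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#ifdef _WIN32
#include <windows.h>   /* GetACP */
#endif

/* Hypothetical helper: returns 1 if s is well-formed UTF-8 (no overlong
   forms, no surrogates, nothing above U+10FFFF), 0 otherwise. */
int is_valid_utf8(const char *s)
{
    /* Smallest code point that legitimately needs 1, 2, 3, 4 bytes. */
    static const unsigned long min_cp[] = { 0, 0x80, 0x800, 0x10000 };
    const unsigned char *p = (const unsigned char *)s;

    while (*p) {
        unsigned long cp;
        int extra, i;

        if (*p < 0x80)                { cp = *p;        extra = 0; }
        else if ((*p & 0xE0) == 0xC0) { cp = *p & 0x1F; extra = 1; }
        else if ((*p & 0xF0) == 0xE0) { cp = *p & 0x0F; extra = 2; }
        else if ((*p & 0xF8) == 0xF0) { cp = *p & 0x07; extra = 3; }
        else return 0;                /* stray continuation or bad lead byte */
        p++;

        for (i = 0; i < extra; i++, p++) {
            if ((*p & 0xC0) != 0x80)  /* also catches a premature NUL */
                return 0;
            cp = (cp << 6) | (*p & 0x3F);
        }

        if (cp < min_cp[extra]) return 0;             /* overlong encoding */
        if (cp >= 0xD800 && cp <= 0xDFFF) return 0;   /* UTF-16 surrogate */
        if (cp > 0x10FFFF) return 0;                  /* out of range */
    }
    return 1;
}

/* Hypothetical helper: prints the active code page (Windows only) and a
   hex dump of every argument exactly as main received it. */
void report_args(int argc, char **argv, FILE *out)
{
    int i;

    assert(argv != NULL && out != NULL);
#ifdef _WIN32
    fprintf(out, "active code page: %u\n", GetACP());
#endif
    for (i = 0; i < argc; i++) {
        const unsigned char *p;

        fprintf(out, "argv[%d] (%s):", i,
                is_valid_utf8(argv[i]) ? "valid UTF-8" : "NOT valid UTF-8");
        for (p = (const unsigned char *)argv[i]; *p; p++)
            fprintf(out, " %02X", *p);
        fputc('\n', out);
    }
}
```

Calling report_args(argc, argv, stdout) as the first statement of main and passing a file name with non-ASCII characters should show whether the bytes are already wrong on entry. Note that a '?' substitution is still valid UTF-8, so the hex dump is the part to look at in that case.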