public inbox for gcc-bugs@sourceware.org
* [Bug c++/111244] New: std::filesystem::path encoding mismatches locale on Windows
@ 2023-08-30 19:10 thiago at kde dot org
  2023-08-30 19:25 ` [Bug libstdc++/111244] " pinskia at gcc dot gnu.org
                   ` (6 more replies)
  0 siblings, 7 replies; 8+ messages in thread
From: thiago at kde dot org @ 2023-08-30 19:10 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111244

            Bug ID: 111244
           Summary: std::filesystem::path encoding mismatches locale on
                    Windows
           Product: gcc
           Version: 13.2.1
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: c++
          Assignee: unassigned at gcc dot gnu.org
          Reporter: thiago at kde dot org
  Target Milestone: ---

Test:
$ cat fstest.cpp 
#include <filesystem>
#include <stdio.h>

int main(int argc, char **argv)
{
    for (int i = 1; i < argc; ++i) {
        std::filesystem::path p(argv[i]);
        if (std::filesystem::exists(p)) {
            printf("%s %llu\n", argv[1], (unsigned long long)std::filesystem::file_size(p));
        } else {
            printf("%s does not exist\n", argv[1]);
        }
    }
}
$ touch filæ
$ g++ fstest.cpp
$ ./a.out fstest.cpp filæ

On Linux (and any other Unix):
fstest.cpp 377
fstest.cpp 0

On Windows with libc++ or MS STL:
fstest.cpp 377
fstest.cpp 0

On Windows with libstdc++:
fstest.cpp 377
terminate called after throwing an instance of 'std::filesystem::__cxx11::filesystem_error'
  what():  filesystem error: Cannot convert character sequence: Illegal byte sequence

This is caused by std::filesystem::path interpreting the narrow-character input
as UTF-8. On Windows, the input is not UTF-8; it is in the locale's encoding and
must be decoded using the locale codec.

Strictly speaking, the same should apply to the conversion to Unicode on Unix
systems too, but a) almost all of them use UTF-8 these days, so the corner cases
may be ignored by a policy decision, and b) an input encoding mismatch there
does not make it impossible to refer to a file through fs::path alone.
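
To illustrate the decoding step described above, here is a minimal sketch of a
caller-side workaround, not a proposed libstdc++ patch. It decodes a narrow
string with the C locale's multibyte conversion (std::mbsrtowcs) and builds the
path from the resulting wide string. The helper names decode_with_locale and
path_from_locale are hypothetical. The sketch assumes the program has called
std::setlocale(LC_ALL, "") beforehand and that the C runtime's multibyte
functions follow the locale's encoding (on Windows, the active code page).

```cpp
#include <cassert>
#include <clocale>
#include <cwchar>
#include <filesystem>
#include <stdexcept>
#include <string>

// Hypothetical helper (not part of libstdc++): decode a narrow string that is
// encoded in the current C locale's encoding into a wide string.
// Assumption: the caller has already run std::setlocale(LC_ALL, "").
inline std::wstring decode_with_locale(const char* s)
{
    assert(s != nullptr);

    std::mbstate_t state{};
    const char* src = s;
    // First pass: with a null destination, mbsrtowcs only counts the wide
    // characters that the conversion would produce.
    std::size_t len = std::mbsrtowcs(nullptr, &src, 0, &state);
    if (len == static_cast<std::size_t>(-1))
        throw std::runtime_error("input is not valid in the locale encoding");

    std::wstring out(len, L'\0');
    state = std::mbstate_t{};
    src = s;
    // Second pass: perform the actual conversion into the buffer.
    std::mbsrtowcs(out.data(), &src, out.size(), &state);
    return out;
}

// Hypothetical helper: construct the path from wchar_t input so that the
// narrow-string (char) constructor is never used.
inline std::filesystem::path path_from_locale(const char* s)
{
    return std::filesystem::path(decode_with_locale(s));
}
```

In the test program above, this would mean calling std::setlocale(LC_ALL, "")
at the start of main and replacing std::filesystem::path p(argv[i]) with
std::filesystem::path p = path_from_locale(argv[i]). Whether this avoids the
exception on a given Windows toolchain depends on the assumptions stated above.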


Thread overview: 8+ messages
2023-08-30 19:10 [Bug c++/111244] New: std::filesystem::path encoding mismatches locale on Windows thiago at kde dot org
2023-08-30 19:25 ` [Bug libstdc++/111244] " pinskia at gcc dot gnu.org
2023-08-30 19:31 ` thiago at kde dot org
2023-08-30 19:59 ` redi at gcc dot gnu.org
2023-08-30 20:03 ` costas.argyris at gmail dot com
2023-08-30 20:15 ` thiago at kde dot org
2023-08-30 20:50 ` costas.argyris at gmail dot com
2023-08-30 20:56 ` thiago at kde dot org
