On 2021/1/14 9:47 AM, Roy Qu via Gcc wrote:
> I use "gcc -finput-charset=utf-8 -fexec-charset=gb2312" to compile utf-8
> encoding source files under windows. Most of the time it works well, but
> when the source file contains some characters such as "—", gcc will fail
> and the error message is: "[Error] converting to execution character set:
> Illegal byte sequence".
>
> The attached file is an example. I have tested the file by using iconv to
> convert it from utf-8 to gbk, and iconv works with no complaints.
>

It looks like this is a bug in iconv. Converting the attached source with `iconv -f utf-8 -t gb2312 testencoding.cpp` gives the same error.

According to the GB2312 code table [1], the EM DASH symbol (U+2014) should map to the double-byte sequence `A1 AA`. GB2312, GBK and GB18030 do not differ on this mapping.

Please consider GB2312 superseded by GBK. The native code page (936) references GBK instead of GB2312.

[1] http://www.khngai.com/chinese/charmap/

> So maybe there's something wrong when gcc is trying to do the encoding
> conversion?
>
> Some information:
> Toolchain: MinGW-W64-i686, gcc 10.2
> System: Windows 10 Simplified Chinese Home edition ver 2004
>

-- 
Best regards,
LH_Mouse
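
For anyone who wants to check how a given converter treats U+2014 under each of the three charsets, a minimal Python sketch follows. The helper names `try_encode` and `survey` are hypothetical and not from this thread. Python's built-in codecs are independent of iconv, so the results may differ from what iconv or GCC produce; treat it as a way to compare implementations, not as a statement of what any of them should do.

```python
# Hypothetical helpers (not from the original thread): report how each
# codec encodes U+2014 EM DASH. Python's codecs are separate from iconv,
# so this only shows what Python's own tables do.
EM_DASH = "\u2014"


def try_encode(ch, codec):
    """Return the encoded bytes as spaced upper-case hex, or None if the codec rejects ch."""
    try:
        data = ch.encode(codec)
    except UnicodeEncodeError:
        return None
    return " ".join(f"{b:02X}" for b in data)


def survey(ch=EM_DASH, codecs=("gb2312", "gbk", "gb18030")):
    """Map each codec name to the hex encoding of ch, or None on failure."""
    return {codec: try_encode(ch, codec) for codec in codecs}


if __name__ == "__main__":
    for codec, result in survey().items():
        shown = result if result is not None else "cannot encode"
        print(f"{codec:8s} -> {shown}")
```

Running the same character through `iconv -f utf-8 -t <charset>` for each charset gives the corresponding picture for iconv.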