* [PATCH v2] binutils/windmc: Parse input correctly on big endian hosts
From: Richard W.M. Jones @ 2024-01-24 12:25 UTC
To: binutils; +Cc: rjones

This second version of the patch uses the recommended way to find
host endianness.

Note that you have to rerun 'autoreconf' with the correct version of
autoconf (2.69) before this actually works.  I didn't include that in
the patch since the change is all in generated code.

(I can send the patch if you really want.  Also, why autoconf 2.69
exactly?  That makes it much harder to build and test this on modern
systems.)

Rich.
* [PATCH v2] binutils/windmc: Parse input correctly on big endian hosts
From: Richard W.M. Jones @ 2024-01-24 12:25 UTC
To: binutils; +Cc: rjones

On big endian hosts (eg. s390x) the windmc tool fails to parse even
trivial files:

  $ cat test.mc
  ;
  $ ./binutils/windmc ./test.mc
  In test.mc at line 1: parser: syntax error.
  In test.mc at line 1: fatal: syntax error.

The tool starts by reading the input as Windows CP1252 and then
converting it internally into an array of UTF-16LE, which it then
processes as an array of unsigned short (typedef unichar).

There are lots of ways this is wrong, but in the specific case of big
endian machines the little endian pairs of bytes are byte-swapped.

For example, the ';' character in the input above is first converted
to UTF16-LE byte sequence { 0x3b, 0x00 }, which is then cast to
unsigned short.  On a big endian machine the first unichar appears to
be 0x3b00.  The lexer is unable to recognize this as the comment
character ((unichar)';') and so parsing fails.

The simple fix is to convert the input to UTF-16BE on big endian
machines (and do the reverse conversion when writing the output).

Fixes: https://sourceware.org/bugzilla/show_bug.cgi?id=31283
Signed-off-by: Richard W.M. Jones <rjones@redhat.com>
---
 binutils/configure.ac |  2 ++
 binutils/winduni.c    | 16 ++++++++++++++--
 2 files changed, 16 insertions(+), 2 deletions(-)

diff --git a/binutils/configure.ac b/binutils/configure.ac
index b03e36c9e0e..dac72c1bdd4 100644
--- a/binutils/configure.ac
+++ b/binutils/configure.ac
@@ -31,6 +31,8 @@ AC_PROG_CC
 AC_GNU_SOURCE
 AC_USE_SYSTEM_EXTENSIONS
 
+AC_C_BIGENDIAN
+
 LT_INIT
 ACX_LARGEFILE
 
diff --git a/binutils/winduni.c b/binutils/winduni.c
index 5b659764948..f19de4f8cb3 100644
--- a/binutils/winduni.c
+++ b/binutils/winduni.c
@@ -771,7 +771,13 @@ wind_MultiByteToWideChar (rc_uint_type cp, const char *mb,
   if (!mb || !iconv_name)
     return 0;
 
-  iconv_t cd = iconv_open ("UTF-16LE", iconv_name);
+  iconv_t cd = iconv_open (
+#if WORDS_BIGENDIAN
+                           "UTF-16BE",
+#else
+                           "UTF-16LE",
+#endif
+                           iconv_name);
 
   while (1)
     {
@@ -844,7 +850,13 @@ wind_WideCharToMultiByte (rc_uint_type cp, const unichar *u, char *mb, rc_uint_t
   if (!u || !iconv_name)
     return 0;
 
-  iconv_t cd = iconv_open (iconv_name, "UTF-16LE");
+  iconv_t cd = iconv_open (iconv_name,
+#if WORDS_BIGENDIAN
+                           "UTF-16BE"
+#else
+                           "UTF-16LE"
+#endif
+                           );
 
   while (1)
     {
-- 
2.39.3
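The byte-swap described in the commit message can be reproduced outside
binutils with a small sketch.  This is not code from winduni.c: the
helper names host_is_big_endian, load_host_unichar and host_utf16_name
are hypothetical, and the sketch probes byte order at run time only
because a standalone file has no config.h to supply WORDS_BIGENDIAN.
It assumes unsigned short is 16 bits wide, as the unichar typedef in
the description implies.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Same typedef name the commit message mentions; assumed to be 16 bits. */
typedef unsigned short unichar;

/* Hypothetical helper: returns nonzero when the host stores the most
   significant byte first.  The real patch gets this at configure time
   from AC_C_BIGENDIAN (WORDS_BIGENDIAN) instead of probing at run time. */
static int host_is_big_endian(void)
{
    const uint16_t probe = 0x0102;
    unsigned char first;

    memcpy(&first, &probe, 1);
    return first == 0x01;
}

/* Hypothetical helper: reinterpret two raw bytes as a host-order unichar,
   which is what treating an iconv output buffer as unichar[] amounts to. */
static unichar load_host_unichar(const unsigned char bytes[2])
{
    unichar u;

    assert(sizeof u == 2);   /* the reinterpretation only makes sense for 16-bit units */
    memcpy(&u, bytes, sizeof u);
    return u;
}

/* Hypothetical helper: the iconv encoding name whose byte order matches
   the host, mirroring the choice the patch makes with #if. */
static const char *host_utf16_name(void)
{
    return host_is_big_endian() ? "UTF-16BE" : "UTF-16LE";
}
```

Loading the UTF-16LE bytes { 0x3b, 0x00 } with load_host_unichar gives
(unichar)';' on a little endian host and 0x3b00 on a big endian one,
which is the mismatch the lexer trips over.  Loading the bytes in the
order named by host_utf16_name gives (unichar)';' on either kind of
host.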
* Re: [PATCH v2] binutils/windmc: Parse input correctly on big endian hosts
From: Nick Clifton @ 2024-01-25 15:18 UTC
To: Richard W.M. Jones, binutils

Hi Richard,

> The simple fix is to convert the input to UTF-16BE on big endian
> machines (and do the reverse conversion when writing the output).

Approved - please apply (mainline and if you wish, the 2.42 branch as well).

Cheers
  Nick
* Re: [PATCH v2] binutils/windmc: Parse input correctly on big endian hosts
From: Richard W.M. Jones @ 2024-01-25 15:52 UTC
To: Nick Clifton; +Cc: binutils

On Thu, Jan 25, 2024 at 03:18:23PM +0000, Nick Clifton wrote:
> Hi Richard,
>
> >The simple fix is to convert the input to UTF-16BE on big endian
> >machines (and do the reverse conversion when writing the output).
>
> Approved - please apply (mainline and if you wish, the 2.42 branch as well).

Thanks - I guess I'm not allowed to push the patch though?

Rich.

-- 
Richard Jones, Virtualization Group, Red Hat http://people.redhat.com/~rjones
Read my programming and virtualization blog: http://rwmj.wordpress.com
virt-builder quickly builds VMs from scratch
http://libguestfs.org/virt-builder.1.html
* Re: [PATCH v2] binutils/windmc: Parse input correctly on big endian hosts
From: Nick Clifton @ 2024-01-26 10:08 UTC
To: Richard W.M. Jones; +Cc: binutils

Hi Rich,

>>> The simple fix is to convert the input to UTF-16BE on big endian
>>> machines (and do the reverse conversion when writing the output).
>>
>> Approved - please apply (mainline and if you wish, the 2.42 branch as well).
>
> Thanks - I guess I'm not allowed to push the patch though?

No you are allowed to push it.

Sorry for the confusion.  By "apply" I actually meant "commit and push".

Cheers
  Nick
* Re: [PATCH v2] binutils/windmc: Parse input correctly on big endian hosts
From: Richard W.M. Jones @ 2024-01-26 10:12 UTC
To: Nick Clifton; +Cc: binutils

On Fri, Jan 26, 2024 at 10:08:32AM +0000, Nick Clifton wrote:
> Hi Rich,
>
> >>>The simple fix is to convert the input to UTF-16BE on big endian
> >>>machines (and do the reverse conversion when writing the output).
> >>
> >>Approved - please apply (mainline and if you wish, the 2.42 branch as well).
> >
> >Thanks - I guess I'm not allowed to push the patch though?
>
> No you are allowed to push it.
>
> Sorry for the confusion.  By "apply" I actually meant "commit and push".

Thanks.  Note that someone will need to rerun autoconf for this change
to actually have an effect.  I didn't include that in the patch as the
change is rather large and all in generated code.

Rich.

-- 
Richard Jones, Virtualization Group, Red Hat http://people.redhat.com/~rjones
Read my programming and virtualization blog: http://rwmj.wordpress.com
nbdkit - Flexible, fast NBD server with plugins
https://gitlab.com/nbdkit/nbdkit
* Re: [PATCH v2] binutils/windmc: Parse input correctly on big endian hosts
From: Nick Clifton @ 2024-01-26 11:51 UTC
To: Richard W.M. Jones; +Cc: binutils

Hi Rich.

> Thanks.  Note that someone will need to rerun autoconf for this change
> to actually have an effect.  I didn't include that in the patch as the
> change is rather large and all in generated code.

Thanks for doing that - having uncluttered patches to review is much appreciated.

Rerunning autoconf on the 2.42 branch will not be a problem - I will be
doing that as part of the release creation process which should be happening
on Monday.  Rerunning it on the mainline should also not be a big issue,
as long as I notice your patch going in.  If I forget, please could you ping
me?

Cheers
  Nick
* Re: [PATCH v2] binutils/windmc: Parse input correctly on big endian hosts
From: Alan Modra @ 2024-02-08 9:50 UTC
To: Nick Clifton; +Cc: Richard W.M. Jones, binutils

On Fri, Jan 26, 2024 at 11:51:07AM +0000, Nick Clifton wrote:
> > Thanks.  Note that someone will need to rerun autoconf for this change
> > to actually have an effect.  I didn't include that in the patch as the
> > change is rather large and all in generated code.
>
> Thanks for doing that - having uncluttered patches to review is much appreciated.
>
> Rerunning autoconf on the 2.42 branch will not be a problem - I will be
> doing that as part of the release creation process which should be happening
> on Monday.  Rerunning it on the mainline should also not be a big issue,
> as long as I notice your patch going in.  If I forget, please could you ping
> me ?

I pushed the patch for Richard.

-- 
Alan Modra
Australia Development Lab, IBM