* Re: [newlib-cygwin] Cygwin: Set threadnames with SetThreadDescription()
From: Corinna Vinschen @ 2022-07-29 18:28 UTC
To: Jon Turney; +Cc: Cygwin Patches

On Jul 29 15:14, Jon Turney wrote:
> On 29/07/2022 12:58, Corinna Vinschen wrote:
> > Hi Jon,
> >
> > On Jul 29 11:01, Jon TURNEY via Cygwin-cvs wrote:
> > > https://sourceware.org/git/gitweb.cgi?p=newlib-cygwin.git;h=d4689b99c68628d9ec2fc1ac7884906ddbf6a2fc
> > >
> > > commit d4689b99c68628d9ec2fc1ac7884906ddbf6a2fc
> > > Author: Jon Turney <jon.turney@dronecode.org.uk>
> > > Date:   Thu May 19 17:27:39 2022 +0100
> > >
> > >     Cygwin: Set threadnames with SetThreadDescription()
> > > [...]
> > > +  /* SetThreadDescription only exists in a wide-char version, so we must
> > > +     convert threadname to wide-char.  The encoding of threadName is
> > > +     unclear, so use UTF8 until we know better. */
> > > +  int bufsize = MultiByteToWideChar (CP_UTF8, 0, threadName, -1, NULL, 0);
> > > +  WCHAR buf[bufsize];
> > > +  bufsize = MultiByteToWideChar (CP_UTF8, 0, threadName, -1, buf, bufsize);
> >
> > I think this is wrong.  The function should use stock mbstowcs instead
> > to get the externally used encoding.  Think of SetThreadName called with
> > program_invocation_short_name in pthread::thread_init_wrapper, or called
> > from pthread_setname_np with an externally provided thread name.  This
> > thread name will use the locale of the application code it's called by.
>
> I'm not sure.
>
> The linux manpage for pthread_setname_np() says "The thread name is a
> meaningful C language string", which I think means it's ASCII-encoded, not
> locale-encoded.

I think this only means, it's a NUL-terminated string.  "Meaningful" is
just trying to nudge developers into using meaningful names, not
something like "blurb".

> (The solaris manpage explicitly says that the thread name is utf8 encoded)

Ok, that's an interesting point.

> The encoding for program_invocation_short_name was also unclear to me.
> (It's the same as argv[0], so I guess it's in whatever encoding the
> filesystem uses, which doesn't have to match the process locale encoding)
>
> Expecting this function to work with non-ASCII names seems optimistic :)

Well, for Linux it's certainly just an arbitrary, NUL-terminated byte
stream, but yeah, it's certainly the only portable way to expect
the portable codeset.

Anyway, feel free to just keep the code as is.  We're typically using
UTF-8 anyway and people switching to one of the legacy codesets are
supposed to know what they are doing.

Corinna

* Re: [newlib-cygwin] Cygwin: Set threadnames with SetThreadDescription()
From: Jon Turney @ 2022-07-31 15:03 UTC
To: Corinna Vinschen, Cygwin Patches

On 29/07/2022 19:28, Corinna Vinschen wrote:
> On Jul 29 15:14, Jon Turney wrote:
>> On 29/07/2022 12:58, Corinna Vinschen wrote:
>>> Hi Jon,
>>>
>>> On Jul 29 11:01, Jon TURNEY via Cygwin-cvs wrote:
>>>> https://sourceware.org/git/gitweb.cgi?p=newlib-cygwin.git;h=d4689b99c68628d9ec2fc1ac7884906ddbf6a2fc
>>>>
>>>> commit d4689b99c68628d9ec2fc1ac7884906ddbf6a2fc
>>>> Author: Jon Turney <jon.turney@dronecode.org.uk>
>>>> Date:   Thu May 19 17:27:39 2022 +0100
>>>>
>>>>     Cygwin: Set threadnames with SetThreadDescription()
>>>> [...]
>>>> +  /* SetThreadDescription only exists in a wide-char version, so we must
>>>> +     convert threadname to wide-char.  The encoding of threadName is
>>>> +     unclear, so use UTF8 until we know better. */
>>>> +  int bufsize = MultiByteToWideChar (CP_UTF8, 0, threadName, -1, NULL, 0);
>>>> +  WCHAR buf[bufsize];
>>>> +  bufsize = MultiByteToWideChar (CP_UTF8, 0, threadName, -1, buf, bufsize);
>>>
>>> I think this is wrong.  The function should use stock mbstowcs instead
>>> to get the externally used encoding.  Think of SetThreadName called with
>>> program_invocation_short_name in pthread::thread_init_wrapper, or called
>>> from pthread_setname_np with an externally provided thread name.  This
>>> thread name will use the locale of the application code it's called by.
>>
>> I'm not sure.
>>
>> The linux manpage for pthread_setname_np() says "The thread name is a
>> meaningful C language string", which I think means it's ASCII-encoded, not
>> locale-encoded.
>
> I think this only means, it's a NUL-terminated string.  "Meaningful" is
> just trying to nudge developers into using meaningful names, not
> something like "blurb".

Oh yeah, that reading makes more sense!

Still, I think the threadname is really just an opaque NUL-terminated
byte sequence which you can get back with pthread_getname_np().

If there are other mechanisms which make that threadname available to
other processes (which might have a different locale), it's unclear how
the encoding is supposed to be handled...

>> (The solaris manpage explicitly says that the thread name is utf8 encoded)
>
> Ok, that's an interesting point.
>
>> The encoding for program_invocation_short_name was also unclear to me.
>> (It's the same as argv[0], so I guess it's in whatever encoding the
>> filesystem uses, which doesn't have to match the process locale encoding)
>>
>> Expecting this function to work with non-ASCII names seems optimistic :)
>
> Well, for Linux it's certainly just an arbitrary, NUL-terminated byte
> stream, but yeah, it's certainly the only portable way to expect
> the portable codeset.
>
> Anyway, feel free to just keep the code as is.  We're typically using
> UTF-8 anyway and people switching to one of the legacy codesets are
> supposed to know what they are doing.

Yes, I think I'll leave this as is until someone complains! :)
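
For comparison with the committed MultiByteToWideChar (CP_UTF8, ...) code,
the mbstowcs alternative raised in the thread could look roughly like the
sketch below.  It is not what was committed and is not taken from the
Cygwin sources: threadname_to_wide is a hypothetical helper name, and the
sketch assumes the POSIX behaviour of mbstowcs, where a NULL destination
returns the number of wide characters needed.  It uses the same
measure-then-convert pattern as the patch, but interprets the bytes in the
encoding of the current C locale rather than always as UTF-8.

```c
#include <assert.h>
#include <stdlib.h>
#include <wchar.h>

/* Hypothetical helper (not part of Cygwin): convert a NUL-terminated
   multibyte thread name to a freshly allocated wide-character string,
   interpreting the bytes in the encoding of the current C locale.
   Returns NULL if the name is not valid in that encoding or if
   allocation fails.  The caller frees the result.  */
wchar_t *
threadname_to_wide (const char *name)
{
  assert (name != NULL);

  /* First call: with a NULL destination, POSIX mbstowcs only counts the
     wide characters needed, excluding the terminating NUL.  */
  size_t len = mbstowcs (NULL, name, 0);
  if (len == (size_t) -1)
    return NULL;

  wchar_t *buf = malloc ((len + 1) * sizeof (wchar_t));
  if (buf == NULL)
    return NULL;

  /* Second call: perform the conversion, including the terminator.  */
  if (mbstowcs (buf, name, len + 1) == (size_t) -1)
    {
      free (buf);
      return NULL;
    }
  return buf;
}
```

With no setlocale() call the process runs in the "C" locale, so only
plain ASCII names are guaranteed to convert; an application that wants
locale-encoded names would call setlocale (LC_ALL, "") first.  The result
would then be handed to SetThreadDescription() in place of the UTF-8
converted buffer.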