From: Martin Sebor <msebor@gmail.com>
To: papa@arbolone.ca, Riot <rain.backnet@gmail.com>,
mingw-w64-public@lists.sourceforge.net
Cc: gcc-help Mailing List <gcc-help@gcc.gnu.org>
Subject: Re: [Mingw-w64-public] toUpper()
Date: Wed, 01 Jul 2015 15:18:00 -0000
Message-ID: <559404A3.30207@gmail.com>
In-Reply-To: <309695B4BE814C0887F18FB11845328E@ArbolOneLT>
On 07/01/2015 06:02 AM, papa@arbolone.ca wrote:
> std::wstring source(L"Hello World");
> std::wstring destination;
> destination.resize(source.size());
> std::transform (source.begin(), source.end(), destination.begin(),
> (int(*)(int))std::toupper);
>
> The above code is what did the trick, do not ask how, I am still
> digesting it. However, any suggestions would be very much appreciated
This solves problem (1) below but doesn't work correctly or
portably because of the second problem I described in my first
response: std::toupper(int) is defined only for narrow character
values in the range [0, UCHAR_MAX] plus EOF. The function has
undefined behavior for arguments outside that range (i.e., all
wchar_t values greater than UCHAR_MAX).
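
To make the range restriction concrete, here is a minimal sketch
for narrow strings only. The helper name to_upper_narrow is made
up for illustration and is not from this thread. It converts each
char to unsigned char before calling std::toupper, so the argument
always lies in [0, UCHAR_MAX] even when char is a signed type.
It does nothing to help with wchar_t.

```cpp
#include <algorithm>
#include <cctype>
#include <string>

// Hypothetical helper (name is an assumption): uppercase a narrow string.
std::string to_upper_narrow(std::string s)
{
    std::transform(s.begin(), s.end(), s.begin(),
                   [](unsigned char c) {
                       // c is in [0, UCHAR_MAX], so the call is well defined
                       return static_cast<char>(std::toupper(c));
                   });
    return s;
}
```

Using a lambda also sidesteps the overload ambiguity, since
std::toupper is called directly rather than passed by name.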
I don't know what will happen on Windows(*) but on Linux, I can
see the program doesn't work correctly for the Latin Extended
Additional block of characters (the first one I noticed). For
instance, running the attached modified version of the program
in a UTF-8 locale such as en_US.utf8 to convert U+1EBD (LATIN
SMALL LETTER E WITH TILDE) to its uppercase form (U+1EBC)
prints:
U+1EBD U+1EBC U+1EBD
when the expected output is:
U+1EBD U+1EBC U+1EBC
If you want to use transform with wide characters, you need
to use towupper (declared in <wctype.h>).
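
As a rough sketch of that suggestion (the helper name
to_upper_wide is hypothetical, not something from this thread),
transform can be combined with std::towupper from <cwctype>
like this:

```cpp
#include <algorithm>
#include <cwctype>
#include <string>

// Hypothetical helper (name is an assumption): uppercase a wide string
// using the LC_CTYPE category of the current C locale.
std::wstring to_upper_wide(std::wstring s)
{
    std::transform(s.begin(), s.end(), s.begin(),
                   [](wchar_t c) {
                       // towupper takes and returns wint_t
                       return static_cast<wchar_t>(std::towupper(c));
                   });
    return s;
}
```

Characters outside ASCII are only converted if a suitable locale
has been selected first, for example with std::setlocale as in
the attached program.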
Martin
[*] I vaguely recall toupper and friends aborting on Windows
when passed an out-of-range argument but I'm not 100% sure.
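
The facet-based alternative from the earlier reply quoted below
can be wrapped up along these lines. This is only a sketch: the
function name to_upper_facet is an assumption, and the locale is
taken as a parameter because the quoted snippet leaves it
unspecified.

```cpp
#include <locale>
#include <string>

// Hypothetical helper (name is an assumption): uppercase a wide string
// in place using the ctype<wchar_t> facet of the locale passed in.
std::wstring to_upper_facet(std::wstring s, const std::locale& loc)
{
    if (s.empty())
        return s;   // nothing to convert

    const std::ctype<wchar_t>& ct =
        std::use_facet<std::ctype<wchar_t> >(loc);

    // Converts every character in the range [first, last)
    ct.toupper(&s[0], &s[0] + s.size());
    return s;
}
```

Passing std::locale::classic() is enough for ASCII text. Other
characters depend on the locale the caller supplies.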
>
> -----Original Message----- From: Martin Sebor
> Sent: Tuesday, June 30, 2015 10:01 PM
> To: Riot ; mingw-w64-public@lists.sourceforge.net
> Cc: gcc-help Mailing List
> Subject: Re: [Mingw-w64-public] toUpper()
>
> On 06/30/2015 05:24 PM, Riot wrote:
>> #include <algorithm>
>> #include <string>
>>
>> std::string str = "Hello World";
>> std::transform(str.begin(), str.end(), str.begin(), std::toupper);
>
> Please note this code is subtly incorrect for two reasons.
> There are two overloads of std::toupper:
>
> 1) int toupper(int) declared in <ctype.h> (and the equivalent
> std::toupper in <cctype>)
> 2) template <class charT> charT std::toupper(charT, const locale&)
> in <locale>
>
> Without the right #include directive, the above may or may
> not resolve to "the right" function (which depends on what
> declarations the two headers bring into scope).
>
> When it resolves to (2) it will fail to compile.
>
> When it resolves to (1), it will do the wrong thing (have
> undefined behavior) at runtime when char is a signed type
> and the argument is negative (because (1) is only defined
> for EOF and values between 0 and UCHAR_MAX).
>
> But the question is about converting std::wstring to upper
> case and the above uses a narrow string. For wstring, the
> std::ctype<wchar_t>::toupper() function or its convenience
> non-member template function can be used.
>
>> See also: http://www.cplusplus.com/reference/locale/toupper/
>
> This is one possible way to do it. Another approach is along
> these lines:
>
> std::locale loc (...);
> std::wstring wstr = L"...";
> const std::ctype<wchar_t> &ct =
> std::use_facet<std::ctype<wchar_t> >(loc);
> ct.toupper (&wstr[0], &wstr[0] + wstr.size());
>
> Martin
>
>>
>> This may also help in future: http://lmgtfy.com/?q=c%2B%2B+toupper
>>
>> -Riot
>>
>> On 30 June 2015 at 23:58, <papa@arbolone.ca> wrote:
>>> I would like to write a function to capitalize letters, say...
>>> std::wstring toUpper(const std::wstring wstr){
>>> for ( auto it = wstr.begin(); it != wstr.end(); ++it){
>>> global_wapstr.append(std::towupper(&it));
>>>
>>> }
>>> }
>>>
>>> This doesn't work, but doesn't the standard already have something like
>>> std::wstring::toUpper(...)?
>>>
>>> Thanks in advance
[-- Attachment #2: u.cpp --]
[-- Type: text/x-c++src, Size: 608 bytes --]
#include <algorithm>
#include <cctype>
#include <cwctype>
#include <clocale>
#include <stdio.h>
#include <string>
int main ()
{
    if (!std::setlocale (LC_ALL, ""))
        return 1;

    // convert LATIN SMALL LETTER E WITH TILDE (U+1EBD)
    // to LATIN CAPITAL LETTER E WITH TILDE (U+1EBC)
    std::wstring source(L"\x1ebd");
    std::wstring destination;
    destination.resize(source.size());

    std::transform (source.begin(), source.end(), destination.begin(),
                    (int(*)(int))std::toupper);

    printf ("U+%04X U+%04X U+%04X\n",
            source [0], towupper (source [0]), destination [0]);
}