From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (qmail 58125 invoked by alias); 1 Jul 2015 15:18:03 -0000 Mailing-List: contact gcc-help-help@gcc.gnu.org; run by ezmlm Precedence: bulk List-Id: List-Archive: List-Post: List-Help: Sender: gcc-help-owner@gcc.gnu.org Received: (qmail 58113 invoked by uid 89); 1 Jul 2015 15:18:02 -0000 Authentication-Results: sourceware.org; auth=none X-Virus-Found: No X-Spam-SWARE-Status: No, score=-0.7 required=5.0 tests=AWL,BAYES_50,FREEMAIL_FROM,KAM_ASCII_DIVIDERS,RCVD_IN_DNSWL_LOW,SPF_PASS autolearn=no version=3.3.2 X-HELO: mail-qg0-f42.google.com Received: from mail-qg0-f42.google.com (HELO mail-qg0-f42.google.com) (209.85.192.42) by sourceware.org (qpsmtpd/0.93/v0.84-503-g423c35a) with (AES128-GCM-SHA256 encrypted) ESMTPS; Wed, 01 Jul 2015 15:18:01 +0000 Received: by qgii30 with SMTP id i30so20093068qgi.1 for ; Wed, 01 Jul 2015 08:17:59 -0700 (PDT) X-Received: by 10.55.21.141 with SMTP id 13mr55681135qkv.101.1435763878936; Wed, 01 Jul 2015 08:17:58 -0700 (PDT) Received: from [192.168.0.26] (97-124-164-181.hlrn.qwest.net. [97.124.164.181]) by mx.google.com with ESMTPSA id l33sm1119811qkh.12.2015.07.01.08.17.56 (version=TLSv1.2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128); Wed, 01 Jul 2015 08:17:57 -0700 (PDT) Message-ID: <559404A3.30207@gmail.com> Date: Wed, 01 Jul 2015 15:18:00 -0000 From: Martin Sebor User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:31.0) Gecko/20100101 Thunderbird/31.4.0 MIME-Version: 1.0 To: papa@arbolone.ca, Riot , mingw-w64-public@lists.sourceforge.net CC: gcc-help Mailing List Subject: Re: [Mingw-w64-public] toUpper() References: <76E10177FB2B41508BFB693CE76AF944@ArbolOneLT> <559349E9.2020008@gmail.com> <309695B4BE814C0887F18FB11845328E@ArbolOneLT> In-Reply-To: <309695B4BE814C0887F18FB11845328E@ArbolOneLT> Content-Type: multipart/mixed; boundary="------------040200080406090702010004" X-IsSubscribed: yes X-SW-Source: 2015-07/txt/msg00008.txt.bz2 This is a multi-part message in MIME format. 
--------------040200080406090702010004
Content-Type: text/plain; charset=windows-1252; format=flowed
Content-Transfer-Encoding: 8bit
Content-length: 4521

On 07/01/2015 06:02 AM, papa@arbolone.ca wrote:
> std::wstring source(L"Hello World");
> std::wstring destination;
> destination.resize(source.size());
> std::transform (source.begin(), source.end(), destination.begin(),
> (int(*)(int))std::toupper);
>
> The above code is what did the trick, do not ask how, I am still
> digesting it. However, any suggestions would be very much appreciated

The cast selects the int toupper(int) overload, so this solves
problem (1) below. It still doesn't work correctly or portably
because of the second problem I described in my first response.

std::toupper(int) is defined only for narrow character values in
the range [0, UCHAR_MAX] plus EOF. The function has undefined
behavior for arguments outside that range (i.e., for every wchar_t
value greater than UCHAR_MAX).

I don't know what will happen on Windows(*), but on Linux I can
see that the program doesn't work correctly for the Latin Extended
Additional block of characters (the first one I noticed). For
instance, running the attached modified version of the program
in a UTF-8 locale such as en_US.utf8 to convert U+1EBD (LATIN
SMALL LETTER E WITH TILDE) to its uppercase form (U+1EBC) prints:

   U+1EBD U+1EBC U+1EBD

when the expected output is:

   U+1EBD U+1EBC U+1EBC

If you want to use transform with wide characters, you need to
use towupper (declared in <wctype.h>, or std::towupper in
<cwctype>).

Martin

[*] I vaguely recall toupper and friends aborting on Windows
when passed an out-of-range argument, but I'm not 100% sure.
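For illustration, here is a minimal sketch of the two wide-character approaches mentioned in this thread: std::transform with towupper, and the std::ctype<wchar_t> facet quoted below. The helper names to_upper_towupper and to_upper_facet are hypothetical (they do not appear in the thread), and the sketch assumes a C++11 compiler for the lambda.

```cpp
#include <algorithm>
#include <cwctype>
#include <locale>
#include <string>

// Hypothetical helper (name not from the thread): uppercase a wide
// string with std::towupper, which accepts any wchar_t value, unlike
// the narrow std::toupper(int).
std::wstring to_upper_towupper(const std::wstring &source)
{
    std::wstring destination(source.size(), L'\0');
    std::transform(source.begin(), source.end(), destination.begin(),
                   [](wchar_t c) {
                       // towupper returns wint_t; convert back to wchar_t
                       return static_cast<wchar_t>(std::towupper(c));
                   });
    return destination;
}

// Hypothetical helper (name not from the thread): the same conversion
// through the std::ctype<wchar_t> facet of an explicitly supplied locale.
std::wstring to_upper_facet(std::wstring text,
                            const std::locale &loc = std::locale())
{
    if (!text.empty()) {
        const std::ctype<wchar_t> &ct =
            std::use_facet<std::ctype<wchar_t> >(loc);
        // Converts the range [low, high) in place
        ct.toupper(&text[0], &text[0] + text.size());
    }
    return text;
}
```

The result of towupper depends on the LC_CTYPE category of the current C locale, so for non-ASCII input a program would typically call std::setlocale (LC_ALL, "") first, as the attached u.cpp does. The facet version instead uses whatever std::locale object is passed in.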
>
> -----Original Message----- From: Martin Sebor
> Sent: Tuesday, June 30, 2015 10:01 PM
> To: Riot; mingw-w64-public@lists.sourceforge.net
> Cc: gcc-help Mailing List
> Subject: Re: [Mingw-w64-public] toUpper()
>
> On 06/30/2015 05:24 PM, Riot wrote:
>> #include <algorithm>
>> #include <string>
>>
>> std::string str = "Hello World";
>> std::transform(str.begin(), str.end(), str.begin(), std::toupper);
>
> Please note this code is subtly incorrect for two reasons.
> There are two overloads of std::toupper:
>
> 1) int toupper(int) declared in <ctype.h> (and the equivalent
>    std::toupper in <cctype>)
> 2) template <class charT> charT std::toupper(charT, const locale&)
>    in <locale>
>
> Without the right #include directive, the above may or may
> not resolve to "the right" function (which depends on what
> declarations the two headers bring into scope).
>
> When it resolves to (2) it will fail to compile.
>
> When it resolves to (1), it will do the wrong thing (have
> undefined behavior) at runtime when char is a signed type
> and the argument is negative (because (1) is only defined
> for values between -1 and UCHAR_MAX).
>
> But the question is about converting std::wstring to upper
> case and the above uses a narrow string. For wstring, the
> std::ctype<wchar_t>::toupper() function or its convenience
> non-member template function can be used.
>
>> See also: http://www.cplusplus.com/reference/locale/toupper/
>
> This is one possible way to do it. Another approach is along
> these lines:
>
> std::locale loc (...);
> std::wstring wstr = L"...";
> const std::ctype<wchar_t> &ct =
>     std::use_facet<std::ctype<wchar_t> >(loc);
> ct.toupper (&wstr[0], &wstr[0] + wstr.size());
>
> Martin
>
>>
>> This may also help in future: http://lmgtfy.com/?q=c%2B%2B+toupper
>>
>> -Riot
>>
>> On 30 June 2015 at 23:58, wrote:
>>> I would like to write a function to capitalize letters, say...
>>> std::wstring toUpper(const std::wstring wstr){
>>>     for ( auto it = wstr.begin(); it != wstr.end(); ++it){
>>>         global_wapstr.append(std::towupper(&it));
>>>
>>>     }
>>> }
>>>
>>> This doesn’t work, but doesn’t the standard already have something like
>>> std::wstring::toUpper(...)?
>>>
>>> Thanks in advance
>

--------------040200080406090702010004
Content-Type: text/x-c++src; name="u.cpp"
Content-Transfer-Encoding: 7bit
Content-Disposition: attachment; filename="u.cpp"
Content-length: 608

#include <algorithm>
#include <cctype>
#include <clocale>
#include <cstdio>
#include <cwctype>
#include <string>

int main ()
{
    if (!std::setlocale (LC_ALL, ""))
        return 1;

    // convert LATIN SMALL LETTER E WITH TILDE (U+1EBD)
    // to LATIN CAPITAL LETTER E WITH TILDE (U+1EBC)
    std::wstring source(L"\x1ebd");
    std::wstring destination;
    destination.resize(source.size());
    std::transform (source.begin(), source.end(), destination.begin(),
                    (int(*)(int))std::toupper);

    printf ("U+%04X U+%04X U+%04X\n",
            source [0], towupper (source [0]), destination [0]);
}

--------------040200080406090702010004--