public inbox for gcc-help@gcc.gnu.org
* Erroneous Comparisons of Negative Characters
@ 2004-05-02  3:05 James W. McKelvey
  2004-05-02 12:05 ` llewelly
From: James W. McKelvey @ 2004-05-02  3:05 UTC (permalink / raw)
  To: gcc-help

When chars are implemented as signed, characters with negative values compare 
properly as individual chars, but improperly when part of a char array or 
std::string -- they compare as unsigned. The problem appears to be that the 
specialization of std::char_traits<char> uses memcmp. This is observed on an 
Alpha running RH 7.1 and gcc version 3.5.0 20040207.

Specifically, std::char_traits<char>::compare is inconsistent with
std::char_traits<char>::lt, which affects std::string (which is just
std::basic_string<char>). The problem also affects strcmp, strncmp, and
strcoll.
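
A minimal sketch for checking the C library half of this claim separately
from std::char_traits. It is not part of the attached program, and the helper
names (sign, order_as_unsigned, order_as_signed, c_library_orders_as_unsigned)
are hypothetical. ISO C describes memcmp, strcmp, and strncmp as comparing
characters interpreted as unsigned char, so the sketch checks the library
results against an explicit unsigned char ordering. strcoll is left out
because its ordering depends on the current locale.

```cpp
#include <cassert>
#include <cstring>

// Reduce a comparison result to -1, 0 or +1.
inline int sign(int v) { return (v > 0) - (v < 0); }

// Hypothetical helper: order two characters by their unsigned char values.
inline int order_as_unsigned(char a, char b)
{
    const unsigned char ua = static_cast<unsigned char>(a);
    const unsigned char ub = static_cast<unsigned char>(b);
    return (ua > ub) - (ua < ub);
}

// Hypothetical helper: order two characters by their signed char values.
inline int order_as_signed(char a, char b)
{
    const signed char sa = static_cast<signed char>(a);
    const signed char sb = static_cast<signed char>(b);
    return (sa > sb) - (sa < sb);
}

// True when memcmp, strcmp and strncmp all order the one-character strings
// built from a and b the same way order_as_unsigned does.
inline bool c_library_orders_as_unsigned(char a, char b)
{
    const char sa[2] = { a, '\0' };
    const char sb[2] = { b, '\0' };
    const int expected = order_as_unsigned(a, b);
    return sign(std::memcmp(sa, sb, 1)) == expected
        && sign(std::strcmp(sa, sb)) == expected
        && sign(std::strncmp(sa, sb, 1)) == expected;
}

// Example check using the two characters from the test program.
inline void check_example()
{
    assert(c_library_orders_as_unsigned('z', '\300'));
}
```

Calling check_example() from main exercises the same 'z' versus '\300' pair
as the MC, SC, and SN lines of the test program; order_as_signed('z', '\300')
gives the ordering that the CH line reports for plain char on this platform.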

The attached program demonstrates the problem.

I post this because I want to be sure that there isn't some bizarre reason 
that this behavior is intended before I file a bug report; or maybe I am 
doing something wrong.

Result of running test program, with my analysis:

122      Expected character value of 'z'.
-64      Expected value of signed character '\300'
192      Expected value of unsigned character '\300'
-64      Shows that chars are signed

SC 1    Expected character comparison as signed char
UC 0    Expected character comparison as unsigned char
CH 1    Expected character comparison as char (signed)
SV 1    Demonstrates that std::string::value_type is signed
ST 0    Error: std::string of length 1 does not compare the same as CH
BS 1    std::basic_string<signed char> compares as expected
BU 0    std::basic_string<unsigned char> compares as expected
BC 0    Error: Same as ST, as expected (std::basic_string<char>)
TS 1    std::char_traits<signed char> compares signed char * as expected
TU 0    std::char_traits<unsigned char> compares unsigned char * as expected
TC 0    Error: std::char_traits<char>, char * does not compare properly
LS 1    std::char_traits<signed char> compares signed char as expected
LU 0    std::char_traits<unsigned char> compares unsigned char as expected
LC 1    std::char_traits<char> compares char as expected
          (Note: inconsistent with TC)
MC 0    std::memcmp is comparing as unsigned (I guess that's OK)
SC 0    Error: std::strcmp is comparing as unsigned
SN 0    Error: std::strncmp is comparing as unsigned
SL 0    Error: std::strcoll is comparing as unsigned

Comments?

[-- Attachment #2: zzz.cc --]
[-- Type: text/x-c, Size: 2876 bytes --]

#include <cstring>   // std::memcmp, std::strcmp, std::strncmp, std::strcoll
#include <iostream>
#include <string>


int main(int, char **)
{
    typedef signed char   sc;
    typedef unsigned char uc;

    std::cout << int('z')          << std::endl;
    std::cout << int(sc('\300'))   << std::endl;
    std::cout << int(uc('\300'))   << std::endl;
    std::cout << int(char('\300')) << std::endl << std::endl;

    std::cout << "SC " << (sc('z')   > sc('\300'))   << std::endl;
    std::cout << "UC " << (uc('z')   > uc('\300'))   << std::endl;
    std::cout << "CH " << (char('z') > char('\300')) << std::endl;

    std::cout << "SV " << (std::string::value_type('z') >
                           std::string::value_type('\300'))
                       << std::endl;

    std::cout << "ST " << (std::string(1, 'z').compare(
                               std::string(1, '\300')) > 0)
                       << std::endl;

    std::cout << "BS " << (std::basic_string<sc>(1, 'z').compare(
                               std::basic_string<sc>(1, '\300')) > 0)
                       << std::endl;

    std::cout << "BU " << (std::basic_string<uc>(1, 'z').compare(
                               std::basic_string<uc>(1, '\300')) > 0)
                       << std::endl;

    std::cout << "BC " << (std::basic_string<char>(1, 'z').compare(
                               std::basic_string<char>(1, '\300')) > 0)
                       << std::endl;

    std::cout << "TS " << (std::char_traits<sc>::compare(
                               reinterpret_cast<const sc *>("z"),
                               reinterpret_cast<const sc *>("\300"), 1) > 0)
                       << std::endl;

    std::cout << "TU " << (std::char_traits<uc>::compare(
                               reinterpret_cast<const uc *>("z"),
                               reinterpret_cast<const uc *>("\300"), 1) > 0)
                       << std::endl;

    std::cout << "TC " << (std::char_traits<char>::compare("z", "\300", 1) > 0)
                       << std::endl;

    std::cout << "LS " << (! std::char_traits<sc>::lt(static_cast<sc>('z'),
                                                      static_cast<sc>('\300')))
                       << std::endl;

    std::cout << "LU " << (! std::char_traits<uc>::lt(static_cast<uc>('z'),
                                                      static_cast<uc>('\300')))
                       << std::endl;

    std::cout << "LC " << (! std::char_traits<char>::lt('z', '\300'))
                       << std::endl;

    std::cout << "MC " << (std::memcmp("z", "\300", 1) > 0)
                       << std::endl;

    std::cout << "SC " << (std::strcmp("z", "\300") > 0)
                       << std::endl;

    std::cout << "SN " << (std::strncmp("z", "\300", 1) > 0)
                       << std::endl;

    std::cout << "SL " << (std::strcoll("z", "\300") > 0)
                       << std::endl;
    return 0;
}

* Re: Erroneous Comparisons of Negative Characters
  2004-05-02  3:05 Erroneous Comparisons of Negative Characters James W. McKelvey
@ 2004-05-02 12:05 ` llewelly
From: llewelly @ 2004-05-02 12:05 UTC (permalink / raw)
  To: mckelvey; +Cc: gcc-help

James W. McKelvey <mckelvey@maskull.com> writes:

> When chars are implemented as signed, characters with negative values compare 
> properly as individual chars, but improperly when part of a char array or 
> std::string -- they compare as unsigned. The problem appears to be that the 
> specialization of std::char_traits<char> uses memcmp. This is observed on an 
> Alpha running RH 7.1 and gcc version 3.5.0 20040207.

I see the same behavior with gcc 3.4.0 on i686-freeBSD5.2.

> 
> Specifically, std::char_traits<char>::compare is inconsistent with 
> std::char_traits<char>::lt, which affects std::string (which is just 
> std::basic_string<char>.) The problem also affects strcmp, strncmp, and 
> strcoll.
[snip]

std::char_traits<char>::compare is defined in terms of
std::char_traits<char>::lt in 21.1.1/1, table 37, and I can't find
a reason why they might be allowed to be inconsistent.
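
To make that relationship concrete, here is a rough sketch of a comparison
written only in terms of Traits::lt, plus a check of whether Traits::compare
agrees with it. The names compare_via_lt, sign, and compare_agrees_with_lt
are hypothetical helpers, not anything from libstdc++, and the loop is just
one reading of the table 37 wording rather than the library's implementation.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical helper: lexicographic comparison of the first n characters,
// expressed only through Traits::lt.
template <typename Traits>
int compare_via_lt(const typename Traits::char_type* a,
                   const typename Traits::char_type* b,
                   std::size_t n)
{
    for (std::size_t i = 0; i < n; ++i) {
        if (Traits::lt(a[i], b[i])) return -1;  // a orders first
        if (Traits::lt(b[i], a[i])) return 1;   // b orders first
    }
    return 0;                                   // no difference in n chars
}

// Reduce a comparison result to -1, 0 or +1 so results can be compared.
inline int sign(int v) { return (v > 0) - (v < 0); }

// True when Traits::compare and Traits::lt agree on the first n characters.
template <typename Traits>
bool compare_agrees_with_lt(const typename Traits::char_type* a,
                            const typename Traits::char_type* b,
                            std::size_t n)
{
    return sign(Traits::compare(a, b, n))
        == sign(compare_via_lt<Traits>(a, b, n));
}

// Example check on plain ASCII input, where signedness cannot matter.
inline void check_example()
{
    assert(compare_agrees_with_lt<std::char_traits<char> >("abc", "abd", 3));
}
```

Calling compare_agrees_with_lt<std::char_traits<char> >("z", "\300", 1) is
the interesting case for this thread: a false result would correspond to the
TC versus LC mismatch in the original post. The outcome depends on the
library version and on whether char is signed, so none is stated here.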

I think you should report a bug; see gcc.gnu.org/bugs.html