From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: by sourceware.org (Postfix, from userid 48)
	id 7C4D73857C60; Wed, 13 Mar 2024 13:22:44 +0000 (GMT)
DKIM-Filter: OpenDKIM Filter v2.11.0 sourceware.org 7C4D73857C60
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gcc.gnu.org;
	s=default; t=1710336164;
	bh=PB1o9NiyhPkJCRDc+N9IasRwbXO49ydZ/hhw/SYPP+Y=;
	h=From:To:Subject:Date:In-Reply-To:References:From;
	b=RI77JLFUjkaElKpvqS2Eam2tMj/UezfNKFUZqqAbMs1T09JOO2WBQ90y2xBNZAr8i
	 HbV4MMMthpjuwRW5spL//D51VXcn+KQleCvyK9NNiJAi3vC3DWpqt0VPNQKCFwjTTd
	 WoGFscMZhIb4j/uebOom/xfCPEhkwganCxRvWueg=
From: "ro at CeBiTec dot Uni-Bielefeld.DE"
To: gcc-bugs@gcc.gnu.org
Subject: [Bug c++/112652] g++.dg/cpp26/literals2.C FAILs
Date: Wed, 13 Mar 2024 13:22:41 +0000
X-Bugzilla-Reason: CC
X-Bugzilla-Type: changed
X-Bugzilla-Watch-Reason: None
X-Bugzilla-Product: gcc
X-Bugzilla-Component: c++
X-Bugzilla-Version: 14.0
X-Bugzilla-Keywords: testsuite-fail
X-Bugzilla-Severity: normal
X-Bugzilla-Who: ro at CeBiTec dot Uni-Bielefeld.DE
X-Bugzilla-Status: UNCONFIRMED
X-Bugzilla-Resolution:
X-Bugzilla-Priority: P3
X-Bugzilla-Assigned-To: unassigned at gcc dot gnu.org
X-Bugzilla-Target-Milestone: 14.0
X-Bugzilla-Flags:
X-Bugzilla-Changed-Fields:
Message-ID:
In-Reply-To:
References:
Content-Type: text/plain; charset="UTF-8"
Content-Transfer-Encoding: quoted-printable
X-Bugzilla-URL: http://gcc.gnu.org/bugzilla/
Auto-Submitted: auto-generated
MIME-Version: 1.0
List-Id:

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112652

--- Comment #6 from ro at CeBiTec dot Uni-Bielefeld.DE ---
> --- Comment #5 from ro at CeBiTec dot Uni-Bielefeld.DE ---
>> --- Comment #4 from Jakub Jelinek ---
>> Given that C++ says e.g.
>> in https://eel.is/c++draft/lex.ccon#3.1
>> that program is ill-formed if some character lacks encoding in the execution
>> character set, I'm afraid the Solaris iconv behavior results in violation of

Although I can barely wrap my head around the standardese there, I had a
look at n4928 (the last? C++23 draft), which has a different wording
here (p.25, 5.13.3):

  (3.1) — A character-literal with a c-char-sequence consisting of a single
  basic-c-char, simple-escape-sequence, or universal-character-name is
  the code unit value of the specified character as encoded in the
  literal’s associated character encoding. [Note 2 : If the specified
  character lacks representation in the literal’s associated character
  encoding or if it cannot be encoded as a single code unit, then the
  literal is a non-encodable character literal. —end note]

> I've not yet tried to understand what either iconv(3) has to say on the
> matter.

Digging further, Solaris iconv(3C) has

       If iconv() encounters a character in the input buffer that is legal,
       but for which an identical character does not exist in the target code
       set, iconv() performs an implementation-defined conversion on this
       character.

which exactly matches XPG7, so the behaviour seems to be in line with
the standards.

I've also found that Solaris 11 has iconvctl(3C) (obviously patterned
after GNU libiconv) with

       ICONV_SET_TRANSLITERATE

           With this request and a pointer to a const int with a non-zero
           value, caller can instruct the current conversion to transliterate
           non-identical characters from the input buffer during the code
           conversion as much as it can. The value of zero, on the other hand,
           turns it off.

However,

  int transliterate = 0;
  iconvctl (cd, ICONV_SET_TRANSLITERATE, &transliterate);

doesn't make a difference.

The current Solaris iconv behaviour certainly isn't particularly
intuitive and I'll ask the Solaris engineers about it.

However, there's the question of what to do about the testcase: just
xfail it on Solaris, or omit just the two affected subtests there?
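
For reference, the iconv() behaviour discussed above could be observed
with a small probe along the following lines. This is only a sketch,
not what GCC or the testcase does: probe_conversion and the probe_result
values are names made up for illustration, it assumes the POSIX
prototype of iconv() taking char **, and which outcome a platform
reports for a character missing from the target code set is exactly the
implementation-defined part.

```c
#include <assert.h>
#include <errno.h>
#include <iconv.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical classification of one conversion attempt.  */
enum probe_result
{
  PROBE_OPEN_FAILED = -1, /* iconv_open() rejected the code set pair */
  PROBE_EXACT = 0,        /* converted, no non-identical conversions */
  PROBE_SUBSTITUTED = 1,  /* converted, iconv() returned a count > 0 */
  PROBE_REJECTED = 2,     /* iconv() failed with EILSEQ */
  PROBE_OTHER_ERROR = 3   /* any other iconv() failure, e.g. E2BIG */
};

/* Convert TEXT from code set FROM to code set TO into OUT (OUTSIZE
   bytes, NUL-terminated on return) and classify what iconv() did.  */
enum probe_result
probe_conversion (const char *from, const char *to, const char *text,
                  char *out, size_t outsize)
{
  assert (out != NULL && outsize > 0);
  out[0] = '\0';

  iconv_t cd = iconv_open (to, from);
  if (cd == (iconv_t) -1)
    return PROBE_OPEN_FAILED;

  /* iconv() wants a non-const pointer, but does not modify the input.  */
  char *in = (char *) text;
  size_t inleft = strlen (text);
  char *outp = out;
  size_t outleft = outsize - 1; /* keep room for the terminating NUL */

  errno = 0;
  size_t n = iconv (cd, &in, &inleft, &outp, &outleft);
  int err = errno;
  *outp = '\0';
  iconv_close (cd);

  if (n == (size_t) -1)
    return err == EILSEQ ? PROBE_REJECTED : PROBE_OTHER_ERROR;

  /* On success, POSIX has iconv() return the number of non-identical
     conversions it performed.  */
  return n > 0 ? PROBE_SUBSTITUTED : PROBE_EXACT;
}
```

Calling it with "UTF-8", "ISO-8859-1" and the UTF-8 bytes of U+20AC
("\xe2\x82\xac") should yield PROBE_REJECTED where iconv() fails with
EILSEQ, and PROBE_SUBSTITUTED where it converts to some replacement
character instead, assuming the implementation counts that as a
non-identical conversion.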