From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from esa3.mentor.iphmx.com (esa3.mentor.iphmx.com [68.232.137.180]) by sourceware.org (Postfix) with ESMTPS id 55B073858D39 for ; Wed, 22 Sep 2021 09:49:10 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.1 sourceware.org 55B073858D39 Authentication-Results: sourceware.org; dmarc=none (p=none dis=none) header.from=codesourcery.com Authentication-Results: sourceware.org; spf=pass smtp.mailfrom=mentor.com IronPort-SDR: +k0qRO0g4F7Fy2kcHsYPVyO946GOwB2fN5T39uUfYyWf+aatHmNbVbcHIxYmGBr0GQpNPKkfGg 30vE+VSKjec3m7hAUBS51vO+5OXrmliXCePy2ZFIZxySv80FnA3QwUZywqBHIfeWg6ZHR0TwWT reaMgRGFg0ki9ivLsWB+PxjfJC7gkLOYww+p9H1TZyugrhqrWgRqoirLhCzsQOvWlGzZof1OR5 RQiAF0eeWKLWAFQx80k5iGBD095R8zN76owb0dQzovFCXMccxYIPEp+E4tM6BPOceRZCpqFP/B ImHpEMnoRlUxQ3VuWeMDcH/o X-IronPort-AV: E=Sophos;i="5.85,313,1624348800"; d="scan'208";a="66150111" Received: from orw-gwy-02-in.mentorg.com ([192.94.38.167]) by esa3.mentor.iphmx.com with ESMTP; 22 Sep 2021 01:49:09 -0800 IronPort-SDR: 7YjNMytNctuAKvCWMhhkD64cchYkdTi3t9iAEE4qg4o+xq+lcPQXpZTcAewuwndsX29wQxoidC Y6V6DPAdrSuxKC/tj1c8a1Novt+f88qz2lRhRFgEmfFnJ0v/7Fy+CpHE9kWZidRRFRPYTSk0/r E6XMbFM5ijyb4rOE6kEoK3t9Y83cGXDa07N7nFCfb0pk1AqpnBe6D35HGrc/utAOMOw54RbjCW YzwlMA3uHLT5REcuUO/l07qDY9M1Xbw5P7zlSIUWKSD1s9lTlYf+PR78rwibThctkqSN+Qpv/I bH8= From: Thomas Schwinge To: Mark Wielaard CC: Subject: Re: [PATCH] Fix byte char and byte string lexing code In-Reply-To: <20210921225430.166550-1-mark@klomp.org> References: <20210921225430.166550-1-mark@klomp.org> User-Agent: Notmuch/0.29.3+94~g74c3f1b (https://notmuchmail.org) Emacs/27.1 (x86_64-pc-linux-gnu) Date: Wed, 22 Sep 2021 11:48:56 +0200 Message-ID: <87k0j9ym7r.fsf@euler.schwinge.homeip.net> MIME-Version: 1.0 Content-Type: text/plain; charset="utf-8" Content-Transfer-Encoding: quoted-printable X-Originating-IP: [137.202.0.90] X-ClientProxiedBy: svr-ies-mbx-02.mgc.mentorg.com (139.181.222.2) To svr-ies-mbx-01.mgc.mentorg.com (139.181.222.1) X-Spam-Status: No, 
score=-4.7 required=5.0 tests=BAYES_00, BODY_8BITS, HEADER_FROM_DIFFERENT_DOMAINS, KAM_DMARC_STATUS, SPF_HELO_PASS, SPF_PASS, TXREP autolearn=no autolearn_force=no version=3.4.4 X-Spam-Checker-Version: SpamAssassin 3.4.4 (2020-01-24) on server2.sourceware.org X-BeenThere: gcc-rust@gcc.gnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: gcc-rust mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 22 Sep 2021 09:49:11 -0000 Hi Mark! On 2021-09-22T00:54:30+0200, Mark Wielaard wrote: > There were two warnings in lexer parse_byte_char and parse_byte_string > code for arches with signed chars: > > rust-lex.cc: In member function > =E2=80=98Rust::TokenPtr Rust::Lexer::parse_byte_char(Locatio= n)=E2=80=99: > rust-lex.cc:1564:21: warning: comparison is always false due to limited > range of data type [-Wtype-limits] > 1564 | if (byte_char > 127) > | ~~~~~~~~~~^~~~~ That's . > rust-lex.cc: In member function > =E2=80=98Rust::TokenPtr Rust::Lexer::parse_byte_string(Locat= ion)=E2=80=99: > rust-lex.cc:1639:27: warning: comparison is always false due to limited > range of data type [-Wtype-limits] > 1639 | if (output_char > 127) > | ~~~~~~~~~~~~^~~~~ That's . Both these related to "GCC '--enable-bootstrap' build". > The fix would be to cast to an unsigned char before the comparison. > But that is actually wrong, and would produce the following errors > parsing a byte char or byte string: > > bytecharstring.rs:3:14: error: =E2=80=98byte char=E2=80=99 =E2=80=98=EF= =BF=BD=E2=80=99 out of range > 3 | let _bc =3D b'\x80'; > | ^ > bytecharstring.rs:4:14: error: character =E2=80=98=EF=BF=BD=E2=80=99 in b= yte string out of range > 4 | let _bs =3D b"foo\x80bar"; > | ^ > > Both byte chars and byte strings may contain up to \xFF (255) > characters. It is utf-8 chars or strings that can only [truncated here --= but I understand what you mean] I think this does match my thoughts in . 
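
To make the point above concrete, here is a minimal C++ sketch. It is not the code from rust-lex.cc: the function names `hex_escape_in_range` and `cast_then_check_rejects` are hypothetical, and the sketch assumes an 8-bit char. It shows why casting to unsigned char while keeping the `> 127` check would reject a valid byte char such as b'\x80', and what range check each kind of literal would need according to the description in the patch.

```cpp
#include <cassert>
#include <limits>

// Hypothetical helper, not taken from rust-lex.cc: range check for the
// numeric value of a \xNN escape.
// Byte chars and byte strings accept 0x00..0xFF; ordinary chars and
// strings only accept 0x00..0x7F.
bool
hex_escape_in_range (unsigned int value, bool is_byte_literal)
{
  const unsigned int max = is_byte_literal ? 0xFFu : 0x7Fu;
  return value <= max;
}

// The rejected fix: cast to unsigned char, then keep the "> 127" check.
// The comparison is no longer always false, so b'\x80' gets rejected.
bool
cast_then_check_rejects (char byte_char)
{
  return static_cast<unsigned char> (byte_char) > 127;
}

// A few spot checks of the behaviour described above.
void
check_examples ()
{
  // A signed 8-bit char cannot hold a value above 127, which is why
  // comparing it against 127 with ">" triggers -Wtype-limits.
  static_assert (std::numeric_limits<signed char>::max () == 127,
                 "this sketch assumes an 8-bit char");

  // 0x80 is fine in a byte literal, but not in an ordinary char or string.
  assert (hex_escape_in_range (0x80, true));
  assert (!hex_escape_in_range (0x80, false));

  // The cast-then-check variant wrongly flags the valid byte 0x80.
  assert (cast_then_check_rejects (static_cast<char> (0x80)));
  assert (!cast_then_check_rejects ('a'));
}
```

Calling `check_examples ()` runs the spot checks; none of the assertions should fire on a platform with 8-bit chars, whether plain char is signed or unsigned.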
> Remove the faulty check and add a new testcase bytecharstring.rs
> that checks byte chars and strings do accept > 127 hex char
> escapes, but utf-8 chars and strings reject such hex char escapes.
> ---
>
> https://code.wildebeest.org/git/user/mjw/gccrs/commit/?h=bytecharstring

Thanks, that's now: "Fix byte char and byte string lexing code".


Regards,
 Thomas