From: "redi at gcc dot gnu.org"
To: gcc-bugs@gcc.gnu.org
Subject: [Bug libstdc++/102447] std::regex incorrectly accepts invalid bracket expression
Date: Sat, 02 Oct 2021 06:55:39 +0000
X-Bugzilla-Product: gcc
X-Bugzilla-Component: libstdc++
X-Bugzilla-Version: 12.0
X-Bugzilla-Severity: normal
X-Bugzilla-Status: NEW
X-Bugzilla-Priority: P3
X-Bugzilla-Assigned-To: redi at gcc dot gnu.org
List-Id: Gcc-bugs mailing list

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102447

--- Comment #6 from Jonathan Wakely ---
I have looked in detail (I have the 3rd, 4th and 5th editions here) but my
brain started oozing out of my ears.

15.10.2.15 NonemptyClassRanges and 15.10.2.16 NonemptyClassRangesNoDash are
the relevant sections of the 1999 3rd edition. The former defines:

  The internal helper function CharacterRange takes two CharSet parameters A
  and B and performs the following:
  1. If A does not contain exactly one character or B does not contain exactly
     one character then throw a SyntaxError exception.

And the latter has this note:

  Informative comments: ClassRanges can expand into single ClassAtoms and/or
  ranges of two ClassAtoms separated by dashes. In the latter case the
  ClassRanges includes all characters between the first ClassAtom and the
  second ClassAtom, inclusive; an error occurs if either ClassAtom does not
  represent a single character (for example, if one is \w) or if the first
  ClassAtom's code point value is greater than the second ClassAtom's code
  point value.

The ClassAtom \w does not contain exactly one character, so I think it's a
syntax error.

The 3rd edition doesn't mention any legacy features of RegExp, but it does
seem to require the strict behaviour.
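
To make the quoted rule concrete, here is a rough sketch of how I read CharacterRange. This is only an illustration of the two error conditions above, not text from the spec and not the libstdc++ implementation. The names CharSet, SyntaxError, character_range and word_chars are hypothetical, and word_chars is an ASCII-only approximation of \w.

```cpp
#include <set>
#include <stdexcept>

// Hypothetical stand-in for the spec's CharSet: a set of character codes.
using CharSet = std::set<unsigned char>;

// Hypothetical stand-in for the spec's SyntaxError exception.
struct SyntaxError : std::runtime_error {
    using std::runtime_error::runtime_error;
};

// Sketch of CharacterRange(A, B) as described in the quoted text:
// both operands must denote exactly one character, and the first
// must not be greater than the second.
CharSet character_range(const CharSet& a, const CharSet& b)
{
    if (a.size() != 1 || b.size() != 1)
        throw SyntaxError("range endpoint is not a single character");

    const unsigned char lo = *a.begin();
    const unsigned char hi = *b.begin();
    if (lo > hi)
        throw SyntaxError("range endpoints are out of order");

    // Collect every character from lo to hi, inclusive.
    CharSet result;
    for (int c = lo; c <= hi; ++c)
        result.insert(static_cast<unsigned char>(c));
    return result;
}

// Assumption: a simplified, ASCII-only version of the set denoted by \w.
CharSet word_chars()
{
    CharSet s;
    s.insert('_');
    for (int c = '0'; c <= '9'; ++c) s.insert(static_cast<unsigned char>(c));
    for (int c = 'A'; c <= 'Z'; ++c) s.insert(static_cast<unsigned char>(c));
    for (int c = 'a'; c <= 'z'; ++c) s.insert(static_cast<unsigned char>(c));
    return s;
}
```

With this sketch, character_range(CharSet{'a'}, CharSet{'c'}) yields the three characters a, b and c, while character_range(word_chars(), CharSet{'a'}) throws, which is the case a bracket expression such as [\w-a] would hit under the strict reading.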