From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: by sourceware.org (Postfix, from userid 48)
	id 687323858C31; Sat, 9 Dec 2023 13:52:05 +0000 (GMT)
DKIM-Filter: OpenDKIM Filter v2.11.0 sourceware.org 687323858C31
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gcc.gnu.org;
	s=default; t=1702129925;
	bh=zVX4/uv+X0+PC4K1Pm2MojJH+zpdlsoeKXPnGIrfqC4=;
	h=From:To:Subject:Date:In-Reply-To:References:From;
	b=NhwpxVbL6LeOCiYa07k1jD3lz2NFX0JtBJSKceiLdJqWU+U7LiImvAd5QdHn+PeAJ
	 Z1++PN72VJ7ap+Ubs2zfw4tMZPTWmx/JzK4D6ee4aEOlcA0eqv7QkRiouTkt+ZfYDs
	 E2LUABTC02lwxM2Vegkr2O+tQpNBy86/NNQPTqyM=
From: "luca.bacci at outlook dot com"
To: gcc-bugs@gcc.gnu.org
Subject: [Bug libstdc++/98723] On Windows with CP936 encoding, regex compiles very slow.
Date: Sat, 09 Dec 2023 13:52:01 +0000
X-Bugzilla-Reason: CC
X-Bugzilla-Type: changed
X-Bugzilla-Watch-Reason: None
X-Bugzilla-Product: gcc
X-Bugzilla-Component: libstdc++
X-Bugzilla-Version: 10.2.0
X-Bugzilla-Keywords:
X-Bugzilla-Severity: normal
X-Bugzilla-Who: luca.bacci at outlook dot com
X-Bugzilla-Status: NEW
X-Bugzilla-Resolution:
X-Bugzilla-Priority: P3
X-Bugzilla-Assigned-To: unassigned at gcc dot gnu.org
X-Bugzilla-Target-Milestone: ---
X-Bugzilla-Flags:
X-Bugzilla-Changed-Fields: cc
Message-ID:
In-Reply-To:
References:
Content-Type: text/plain; charset="UTF-8"
Content-Transfer-Encoding: quoted-printable
X-Bugzilla-URL: http://gcc.gnu.org/bugzilla/
Auto-Submitted: auto-generated
MIME-Version: 1.0
List-Id:

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98723

Luca Bacci changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |luca.bacci at outlook dot com

--- Comment #8 from Luca Bacci ---
(In reply to Jonathan Wakely from comment #1)
> The Windows behaviour fails to conform to the C and C++ standards. I think
> _M_transform should check errno and throw an exception on error (which means
> removing the non-throwing exceptions specification from that function).

Hi Jonathan!

I'm giving it a go, but I have one question: which encoding are the strings
passed to _M_transform() / _M_compare() in?
(libstdc++-v3/config/locale/generic/collate_members.cc)
Is it the execution character set? Or is it always UTF-8?

I am asking because we have to convert to UTF-16 and call wcsxfrm().

Many thanks,
Luca
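
For reference, a minimal sketch of the errno check proposed in comment #1,
written against plain std::strxfrm() rather than the real _M_transform().
The helper name checked_transform and the choice of std::runtime_error are
assumptions made for illustration only. This is not the libstdc++ code, and
it does not attempt the UTF-16 / wcsxfrm() conversion discussed above.

```cpp
#include <cerrno>
#include <cstddef>
#include <cstring>
#include <stdexcept>
#include <string>

// Hypothetical helper (not part of libstdc++): transform a narrow string
// with strxfrm() in the current C locale and throw if errno reports an error.
// Note: s.c_str() stops at the first embedded NUL, so any embedded NULs
// would truncate the input.
std::string checked_transform(const std::string& s)
{
    // First pass: ask only for the length of the transformed string.
    errno = 0;
    const std::size_t needed = std::strxfrm(nullptr, s.c_str(), 0);
    if (errno != 0)
        throw std::runtime_error("strxfrm failed while computing the length");

    // Second pass: transform into a buffer with room for the terminator.
    std::string out(needed + 1, '\0');
    errno = 0;
    const std::size_t written = std::strxfrm(&out[0], s.c_str(), out.size());
    if (errno != 0 || written >= out.size())
        throw std::runtime_error("strxfrm failed while transforming");

    out.resize(written);  // drop the terminator and any unused space
    return out;
}
```

Checking errno after the length query matters because an error return could
otherwise be taken for a very large required length and used to size the
buffer.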