From: "timshen at gcc dot gnu.org"
To: gcc-bugs@gcc.gnu.org
Subject: [Bug c++/63497] std::regex can't handle [^class] correctly and cause runtime crash
Date: Fri, 10 Oct 2014 11:03:00 -0000
X-Bugzilla-Product: gcc
X-Bugzilla-Component: c++
X-Bugzilla-Version: 4.9.1
X-Bugzilla-Status: NEW

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63497

--- Comment #2 from Tim Shen ---
Thanks for reporting :)

This bug is still in trunk; it just somehow doesn't trigger the segfault or wrong output there.

The line trunk/bits/regex_executor.tcc:297:

  if (__state._M_matches(*_M_current))

doesn't check that _M_current != _M_end before dereferencing.

One way to fix it is to create a helper function whose dereference may fail (returning false to signal an unsuccessful dereference), but that is less efficient unless the compiler can reason about the checks and eliminate the redundant ones. I'm not sure about that.

If you are interested, take a look at _Executor::_M_word_boundary and try to exploit the first *_M_current (line 419 in trunk).
My answer is:

  regex_match("", regex("\\b"), regex_constants::match_not_eol);

For that case, can the compiler inline the helper and eliminate the unnecessary _M_current == _M_end comparisons if we blindly check (through some helper function) everywhere? Or, as I prefer, I can go through the whole file and check each direct or indirect dereference of _M_current.
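
To make the "helper function" option concrete, here is a minimal sketch of what a checked dereference could look like. It is not the libstdc++ code: try_deref, matches_at, is_word_char and demo are hypothetical names made up for illustration, and the sketch assumes C++17 for std::optional. Whether the compiler can remove the redundant end checks after inlining is exactly the open question above; this sketch does not answer it.

```cpp
#include <cassert>
#include <cctype>
#include <iterator>
#include <optional>
#include <string>

// Hypothetical helper (not part of libstdc++): dereference `cur` only when it
// is not at `end`; an empty optional signals that the dereference failed.
template <typename Iter>
auto try_deref(Iter cur, Iter end)
    -> std::optional<typename std::iterator_traits<Iter>::value_type>
{
    if (cur == end)
        return std::nullopt;
    return *cur;
}

// Hypothetical stand-in for a matcher state: true if the character at `cur`
// satisfies `pred`. At end of input nothing matches, so `pred` is not called.
template <typename Iter, typename Pred>
bool matches_at(Iter cur, Iter end, Pred pred)
{
    auto ch = try_deref(cur, end);
    return ch.has_value() && pred(*ch);
}

// Illustrative predicate: alphanumeric characters and '_' count as word chars.
inline bool is_word_char(char c)
{
    return std::isalnum(static_cast<unsigned char>(c)) != 0 || c == '_';
}

// Usage sketch: none of these calls reads past the end of the string.
inline void demo()
{
    const std::string empty;
    const std::string text = "a-";
    assert(!try_deref(empty.begin(), empty.end()).has_value());
    assert(matches_at(text.begin(), text.end(), is_word_char));
    assert(!matches_at(text.begin() + 1, text.end(), is_word_char));
    assert(!matches_at(text.end(), text.end(), is_word_char));
}
```

The alternative I prefer, auditing each dereference by hand, avoids the optional wrapper entirely, at the cost of relying on every call site being checked correctly.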