From: "egor_suvorov at mail dot ru"
To: gcc-bugs@gcc.gnu.org
Subject: [Bug libstdc++/98677] New: std::regex constructor triggers valgrind under clang++ with undefined sanitizer; possible use-after-move
Date: Thu, 14 Jan 2021 12:04:06 +0000

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98677

            Bug ID: 98677
           Summary: std::regex constructor triggers valgrind under clang++
                    with undefined sanitizer; possible use-after-move
           Product: gcc
           Version: unknown
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: libstdc++
          Assignee: unassigned at gcc dot gnu.org
          Reporter: egor_suvorov at mail dot ru
  Target Milestone: ---

Consider the following code:

#include <regex>
int main() {
    std::regex regex("x{2,}");
}

If I compile and run it on Ubuntu 20.04 with

clang++-10 -fsanitize=undefined -O2 -g a.cpp && valgrind ./a.out

I get the following error:

==2367== Memcheck, a memory error detector
==2367== Copyright (C) 2002-2017, and GNU GPL'd, by Julian Seward et al.
==2367== Using Valgrind-3.15.0 and LibVEX; rerun with -h for copyright info
==2367== Command: ./a.out
==2367==
==2367== Conditional jump or move depends on uninitialised value(s)
==2367==    at 0x45AC3C: std::__detail::_StateSeq<std::__cxx11::regex_traits<char> >::_M_clone() (regex_automaton.tcc:208)
==2367==    by 0x4341EA: std::__detail::_Compiler<std::__cxx11::regex_traits<char> >::_M_quantifier() (regex_compiler.tcc:253)
==2367==    by 0x432F67: std::__detail::_Compiler<std::__cxx11::regex_traits<char> >::_M_term() (regex_compiler.tcc:143)
==2367==    by 0x432B9A: std::__detail::_Compiler<std::__cxx11::regex_traits<char> >::_M_alternative() (regex_compiler.tcc:123)
==2367==    by 0x427E00: std::__detail::_Compiler<std::__cxx11::regex_traits<char> >::_M_disjunction() (regex_compiler.tcc:99)
==2367==    by 0x42747E: std::__detail::_Compiler<std::__cxx11::regex_traits<char> >::_Compiler(char const*, char const*, std::locale const&, std::regex_constants::syntax_option_type) (regex_compiler.tcc:84)
==2367==    by 0x427149: __compile_nfa<std::__cxx11::regex_traits<char>, const char *> (regex_compiler.h:183)
==2367==    by 0x427149: std::__cxx11::basic_regex<char, std::__cxx11::regex_traits<char> >::basic_regex(char const*, char const*, std::locale, std::regex_constants::syntax_option_type) (regex.h:763)
==2367==    by 0x427025: basic_regex (regex.h:507)
==2367==    by 0x427025: basic_regex (regex.h:440)
==2367==    by 0x427025: main (a.cpp:3)
==2367==
==2367== Conditional jump or move depends on uninitialised value(s)
==2367==    at 0x45AC3C: std::__detail::_StateSeq<std::__cxx11::regex_traits<char> >::_M_clone() (regex_automaton.tcc:208)
==2367==    by 0x434218: std::__detail::_Compiler<std::__cxx11::regex_traits<char> >::_M_quantifier() (regex_compiler.tcc:257)
==2367==    by 0x432F67: std::__detail::_Compiler<std::__cxx11::regex_traits<char> >::_M_term() (regex_compiler.tcc:143)
==2367==    by 0x432B9A: std::__detail::_Compiler<std::__cxx11::regex_traits<char> >::_M_alternative() (regex_compiler.tcc:123)
==2367==    by 0x427E00: std::__detail::_Compiler<std::__cxx11::regex_traits<char> >::_M_disjunction() (regex_compiler.tcc:99)
==2367==    by 0x42747E: std::__detail::_Compiler<std::__cxx11::regex_traits<char> >::_Compiler(char const*, char const*, std::locale const&, std::regex_constants::syntax_option_type) (regex_compiler.tcc:84)
==2367==    by 0x427149: __compile_nfa<std::__cxx11::regex_traits<char>, const char *> (regex_compiler.h:183)
==2367==    by 0x427149: std::__cxx11::basic_regex<char, std::__cxx11::regex_traits<char> >::basic_regex(char const*, char const*, std::locale, std::regex_constants::syntax_option_type) (regex.h:763)
==2367==    by 0x427025: basic_regex (regex.h:507)
==2367==    by 0x427025: basic_regex (regex.h:440)
==2367==    by 0x427025: main (a.cpp:3)
==2367==
==2367==
==2367== HEAP SUMMARY:
==2367==     in use at exit: 0 bytes in 0 blocks
==2367==   total heap usage: 20 allocs, 20 frees, 76,776 bytes allocated
==2367==
==2367== All heap blocks were freed -- no leaks are possible
==2367==
==2367== Use --track-origins=yes to see where uninitialised values come from
==2367== For lists of detected and suppressed errors, rerun with: -s
==2367== ERROR SUMMARY: 3 errors from 2 contexts (suppressed: 0 from 0)

Any one of the following actions removes the error: replacing clang++ with g++, disabling -fsanitize=undefined, disabling -O2, or switching to -stdlib=libc++.
Versions are:

clang version 10.0.0-4ubuntu1
Target: x86_64-pc-linux-gnu
Thread model: posix
InstalledDir: /usr/bin

valgrind-3.15.0

libstdc++-10-dev/focal-updates,focal-security,now 10.2.0-5ubuntu1~20.04 amd64 [installed,automatic]

A friend of mine suggested that it's probably caused by a use-after-move of `__dup` in regex_automaton.tcc:206 (commit e45c41988bfd655b1df7cff8fcf111dc6fb732e3 in the GitHub mirror) and vaguely suggested that maybe clang++ has started to implement some kind of destructive moves:

auto __id = _M_nfa._M_insert_state(std::move(__dup));
__m[__u] = __id;
if (__dup._M_has_alt())