public inbox for gcc-bugs@sourceware.org
* [Bug c++/63497] New: std::regex can't handle [^class] correctly and cause runtime crash
@ 2014-10-09 12:00 moophy at foxmail dot com
  2014-10-09 12:39 ` [Bug c++/63497] " redi at gcc dot gnu.org
                   ` (9 more replies)
  0 siblings, 10 replies; 11+ messages in thread
From: moophy at foxmail dot com @ 2014-10-09 12:00 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63497

            Bug ID: 63497
           Summary: std::regex can't handle [^class] correctly and cause
                    runtime crash
           Product: gcc
           Version: 4.9.1
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: c++
          Assignee: unassigned at gcc dot gnu.org
          Reporter: moophy at foxmail dot com

Created attachment 33671
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=33671&action=edit
preprocessed source files generated

With the pattern "([a-z]*)([^c])(e)(i)([a-z]*)", std::regex_search incorrectly
matches the input "cei", and the program then crashes at run time while
iterating over the std::smatch results.

Source:

#include <string>
#include <regex>
#include <iostream>

int main()
{
    std::string pattern("([a-z]*)([^c])(e)(i)([a-z]*)");
    std::regex r(pattern);
    std::smatch results;
    std::string test_str = "cei";

    if (std::regex_search(test_str, results, r))
    {
        std::cout << results.str() << std::endl;
        for (size_t i = 0; i < results.size(); ++i)
            std::cout << i << ": " << results[i].str() << '\n';
    }
}

Output:
cei
0: cei
1: cei
terminate called after throwing an instance of 'std::length_error'
  what():  basic_string::_S_create
Aborted (core dumped)
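
For comparison, here is a minimal sketch of the behaviour one would expect from
a library without this bug. The pattern requires some character other than 'c'
directly before "ei", so "cei" should not match at all, while inputs such as
"xei" or "their" should. The helper names matches_report_pattern and
non_c_capture are hypothetical and exist only for this illustration.

```cpp
#include <cassert>
#include <regex>
#include <string>

// Hypothetical helper: true if the pattern from the report is found in input.
inline bool matches_report_pattern(const std::string& input)
{
    static const std::regex r("([a-z]*)([^c])(e)(i)([a-z]*)");
    std::smatch results;
    return std::regex_search(input, results, r);
}

// Hypothetical helper: text captured by group 2 (the [^c] group),
// or an empty string when the pattern does not match.
inline std::string non_c_capture(const std::string& input)
{
    static const std::regex r("([a-z]*)([^c])(e)(i)([a-z]*)");
    std::smatch results;
    if (!std::regex_search(input, results, r))
        return std::string();
    return results[2].str();
}
```

On an affected 4.9.x library the first helper is expected to return true for
"cei", which is the wrong result described above.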

Command Line:
g++-4.9 -v -save-temps -std=c++11 -Wall -Wextra main.cpp

GCC Info:
Using built-in specs.
COLLECT_GCC=g++-4.9
COLLECT_LTO_WRAPPER=/usr/lib/gcc/i686-linux-gnu/4.9/lto-wrapper
Target: i686-linux-gnu
Configured with: ../src/configure -v --with-pkgversion='Ubuntu
4.9.1-3ubuntu2~14.04.1' --with-bugurl=file:///usr/share/doc/gcc-4.9/README.Bugs
--enable-languages=c,c++,java,go,d,fortran,objc,obj-c++ --prefix=/usr
--program-suffix=-4.9 --enable-shared --enable-linker-build-id
--libexecdir=/usr/lib --without-included-gettext --enable-threads=posix
--with-gxx-include-dir=/usr/include/c++/4.9 --libdir=/usr/lib --enable-nls
--with-sysroot=/ --enable-clocale=gnu --enable-libstdcxx-debug
--enable-libstdcxx-time=yes --enable-gnu-unique-object --disable-vtable-verify
--enable-plugin --with-system-zlib --disable-browser-plugin
--enable-java-awt=gtk --enable-gtk-cairo
--with-java-home=/usr/lib/jvm/java-1.5.0-gcj-4.9-i386/jre --enable-java-home
--with-jvm-root-dir=/usr/lib/jvm/java-1.5.0-gcj-4.9-i386
--with-jvm-jar-dir=/usr/lib/jvm-exports/java-1.5.0-gcj-4.9-i386
--with-arch-directory=i386 --with-ecj-jar=/usr/share/java/eclipse-ecj.jar
--enable-objc-gc --enable-targets=all --enable-multiarch --disable-werror
--with-arch-32=i686 --with-multilib-list=m32,m64,mx32 --enable-multilib
--with-tune=generic --enable-checking=release --build=i686-linux-gnu
--host=i686-linux-gnu --target=i686-linux-gnu
Thread model: posix
gcc version 4.9.1 (Ubuntu 4.9.1-3ubuntu2~14.04.1) 
COLLECT_GCC_OPTIONS='-v' '-save-temps' '-std=c++11' '-Wall' '-Wextra'
'-shared-libgcc' '-mtune=generic' '-march=i686'
 /usr/lib/gcc/i686-linux-gnu/4.9/cc1plus -E -quiet -v -imultiarch
i386-linux-gnu -D_GNU_SOURCE main.cpp -mtune=generic -march=i686 -std=c++11
-Wall -Wextra -fpch-preprocess -fstack-protector -Wformat-security -o main.ii
ignoring duplicate directory "/usr/include/i386-linux-gnu/c++/4.9"
ignoring nonexistent directory "/usr/local/include/i386-linux-gnu"
ignoring nonexistent directory
"/usr/lib/gcc/i686-linux-gnu/4.9/../../../../i686-linux-gnu/include"
#include "..." search starts here:
#include <...> search starts here:
 /usr/include/c++/4.9
 /usr/include/i386-linux-gnu/c++/4.9
 /usr/include/c++/4.9/backward
 /usr/lib/gcc/i686-linux-gnu/4.9/include
 /usr/local/include
 /usr/lib/gcc/i686-linux-gnu/4.9/include-fixed
 /usr/include/i386-linux-gnu
 /usr/include
End of search list.
COLLECT_GCC_OPTIONS='-v' '-save-temps' '-std=c++11' '-Wall' '-Wextra'
'-shared-libgcc' '-mtune=generic' '-march=i686'
 /usr/lib/gcc/i686-linux-gnu/4.9/cc1plus -fpreprocessed main.ii -quiet
-dumpbase main.cpp -mtune=generic -march=i686 -auxbase main -Wall -Wextra
-std=c++11 -version -fstack-protector -Wformat-security -o main.s
GNU C++ (Ubuntu 4.9.1-3ubuntu2~14.04.1) version 4.9.1 (i686-linux-gnu)
    compiled by GNU C version 4.9.1, GMP version 5.1.3, MPFR version 3.1.2-p3,
MPC version 1.0.1
GGC heuristics: --param ggc-min-expand=100 --param ggc-min-heapsize=131072
GNU C++ (Ubuntu 4.9.1-3ubuntu2~14.04.1) version 4.9.1 (i686-linux-gnu)
    compiled by GNU C version 4.9.1, GMP version 5.1.3, MPFR version 3.1.2-p3,
MPC version 1.0.1
GGC heuristics: --param ggc-min-expand=100 --param ggc-min-heapsize=131072
Compiler executable checksum: 55bd930871a1c0a5ebd28bf15548c7c2
COLLECT_GCC_OPTIONS='-v' '-save-temps' '-std=c++11' '-Wall' '-Wextra'
'-shared-libgcc' '-mtune=generic' '-march=i686'
 as -v --32 -o main.o main.s
GNU assembler version 2.24 (i686-linux-gnu) using BFD version (GNU Binutils for
Ubuntu) 2.24
COMPILER_PATH=/usr/lib/gcc/i686-linux-gnu/4.9/:/usr/lib/gcc/i686-linux-gnu/4.9/:/usr/lib/gcc/i686-linux-gnu/:/usr/lib/gcc/i686-linux-gnu/4.9/:/usr/lib/gcc/i686-linux-gnu/
LIBRARY_PATH=/usr/lib/gcc/i686-linux-gnu/4.9/:/usr/lib/gcc/i686-linux-gnu/4.9/../../../i386-linux-gnu/:/usr/lib/gcc/i686-linux-gnu/4.9/../../../../lib/:/lib/i386-linux-gnu/:/lib/../lib/:/usr/lib/i386-linux-gnu/:/usr/lib/../lib/:/usr/lib/gcc/i686-linux-gnu/4.9/../../../:/lib/:/usr/lib/
COLLECT_GCC_OPTIONS='-v' '-save-temps' '-std=c++11' '-Wall' '-Wextra'
'-shared-libgcc' '-mtune=generic' '-march=i686'
 /usr/lib/gcc/i686-linux-gnu/4.9/collect2 -plugin
/usr/lib/gcc/i686-linux-gnu/4.9/liblto_plugin.so
-plugin-opt=/usr/lib/gcc/i686-linux-gnu/4.9/lto-wrapper
-plugin-opt=-fresolution=main.res -plugin-opt=-pass-through=-lgcc_s
-plugin-opt=-pass-through=-lgcc -plugin-opt=-pass-through=-lc
-plugin-opt=-pass-through=-lgcc_s -plugin-opt=-pass-through=-lgcc --sysroot=/
--build-id --eh-frame-hdr -m elf_i386 --hash-style=gnu --as-needed
-dynamic-linker /lib/ld-linux.so.2 -z relro
/usr/lib/gcc/i686-linux-gnu/4.9/../../../i386-linux-gnu/crt1.o
/usr/lib/gcc/i686-linux-gnu/4.9/../../../i386-linux-gnu/crti.o
/usr/lib/gcc/i686-linux-gnu/4.9/crtbegin.o -L/usr/lib/gcc/i686-linux-gnu/4.9
-L/usr/lib/gcc/i686-linux-gnu/4.9/../../../i386-linux-gnu
-L/usr/lib/gcc/i686-linux-gnu/4.9/../../../../lib -L/lib/i386-linux-gnu
-L/lib/../lib -L/usr/lib/i386-linux-gnu -L/usr/lib/../lib
-L/usr/lib/gcc/i686-linux-gnu/4.9/../../.. main.o -lstdc++ -lm -lgcc_s -lgcc
-lc -lgcc_s -lgcc /usr/lib/gcc/i686-linux-gnu/4.9/crtend.o
/usr/lib/gcc/i686-linux-gnu/4.9/../../../i386-linux-gnu/crtn.o



* [Bug c++/63497] std::regex can't handle [^class] correctly and cause runtime crash
  2014-10-09 12:00 [Bug c++/63497] New: std::regex can't handle [^class] correctly and cause runtime crash moophy at foxmail dot com
@ 2014-10-09 12:39 ` redi at gcc dot gnu.org
  2014-10-10 11:03 ` timshen at gcc dot gnu.org
                   ` (8 subsequent siblings)
  9 siblings, 0 replies; 11+ messages in thread
From: redi at gcc dot gnu.org @ 2014-10-09 12:39 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63497

Jonathan Wakely <redi at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|UNCONFIRMED                 |NEW
   Last reconfirmed|                            |2014-10-09
                 CC|                            |timshen at gcc dot gnu.org
      Known to work|                            |5.0
     Ever confirmed|0                           |1

--- Comment #1 from Jonathan Wakely <redi at gcc dot gnu.org> ---
This seems to be fixed on the trunk, but it still crashes with 4.9.2.



* [Bug c++/63497] std::regex can't handle [^class] correctly and cause runtime crash
  2014-10-09 12:00 [Bug c++/63497] New: std::regex can't handle [^class] correctly and cause runtime crash moophy at foxmail dot com
  2014-10-09 12:39 ` [Bug c++/63497] " redi at gcc dot gnu.org
@ 2014-10-10 11:03 ` timshen at gcc dot gnu.org
  2014-10-10 20:30 ` moophy at foxmail dot com
                   ` (7 subsequent siblings)
  9 siblings, 0 replies; 11+ messages in thread
From: timshen at gcc dot gnu.org @ 2014-10-10 11:03 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63497

--- Comment #2 from Tim Shen <timshen at gcc dot gnu.org> ---
Thanks for reporting :)

This bug is still present in trunk; this test case just happens not to trigger
the segfault or the wrong output there.

The line trunk/bits/regex_executor.tcc:297 :
           if (__state._M_matches(*_M_current))

doesn't check that _M_current != _M_end before dereferencing it.

One way is to create a helper function that does the dereference and is allowed
to fail, returning false when the dereference is not possible. That is less
efficient, though, unless the compiler can reason about the checks and
eliminate the redundant predicates. I'm not sure it can.
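
A rough sketch of what such a helper could look like, outside of libstdc++.
The names matches_at and is_not_c are hypothetical and this is not the code
that was committed; it only shows the idea of refusing to dereference at the
end of the input.

```cpp
#include <cassert>
#include <string>

// Hypothetical checked dereference: applies pred to *cur only when cur is
// not the end iterator. Returns false when there is nothing to dereference.
template <typename Iter, typename Pred>
bool matches_at(Iter cur, Iter end, Pred pred)
{
    return cur != end && pred(*cur);
}

// Stand-in for a [^c] character matcher.
inline bool is_not_c(char ch)
{
    return ch != 'c';
}
```

With a helper of this shape, a matcher asked about the position one past the
last character simply reports "no match" instead of reading past the buffer.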

If you are interested, take a look at _Executor::_M_word_boundary and try to
find an input that makes the first *_M_current (line 419 in trunk) blow up. My
answer is:
regex_match("", regex("\\b"), regex_constants::match_not_eol);

For that case, can the compiler inline the helper and eliminate the unnecessary
_M_current == _M_end comparisons if we blindly check it (through some helper
function) everywhere?
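
The edge case above can be wrapped in a small function for experimenting. This
is only a sketch: the wrapper name empty_input_word_boundary is hypothetical.
Under ECMAScript rules an empty input has no word character on either side of
the only position, so a library that checks its bounds should report no match.

```cpp
#include <cassert>
#include <regex>
#include <string>

// Hypothetical wrapper around the call from the comment above. An empty
// string means the executor's current position equals the end position
// as soon as \b is evaluated.
inline bool empty_input_word_boundary()
{
    const std::string empty;
    return std::regex_match(empty, std::regex("\\b"),
                            std::regex_constants::match_not_eol);
}
```

Running this under a memory checker is one way to see whether a given library
version reads through the end iterator here.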

Or, as I would prefer, I can audit the whole file and add a check for each
direct or indirect dereference of _M_current.



* [Bug c++/63497] std::regex can't handle [^class] correctly and cause runtime crash
  2014-10-09 12:00 [Bug c++/63497] New: std::regex can't handle [^class] correctly and cause runtime crash moophy at foxmail dot com
  2014-10-09 12:39 ` [Bug c++/63497] " redi at gcc dot gnu.org
  2014-10-10 11:03 ` timshen at gcc dot gnu.org
@ 2014-10-10 20:30 ` moophy at foxmail dot com
  2014-10-10 21:07 ` timshen at gcc dot gnu.org
                   ` (6 subsequent siblings)
  9 siblings, 0 replies; 11+ messages in thread
From: moophy at foxmail dot com @ 2014-10-10 20:30 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63497

--- Comment #3 from Yue Wang <moophy at foxmail dot com> ---
Hi guys!

Thanks for replying.
To be honest, as a newbie I'm not experienced enough to understand @Tim's
reply...
Thanks again for the effort. I hope gcc's std::regex keeps getting better and
better.



* [Bug c++/63497] std::regex can't handle [^class] correctly and cause runtime crash
  2014-10-09 12:00 [Bug c++/63497] New: std::regex can't handle [^class] correctly and cause runtime crash moophy at foxmail dot com
                   ` (2 preceding siblings ...)
  2014-10-10 20:30 ` moophy at foxmail dot com
@ 2014-10-10 21:07 ` timshen at gcc dot gnu.org
  2014-10-23  3:34 ` timshen at gcc dot gnu.org
                   ` (5 subsequent siblings)
  9 siblings, 0 replies; 11+ messages in thread
From: timshen at gcc dot gnu.org @ 2014-10-10 21:07 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63497

--- Comment #4 from Tim Shen <timshen at gcc dot gnu.org> ---
Hi Yue Wang, I'm sorry if my last reply looked scary. I should have posted it
to the libstdc++ list. I didn't mean to reply to you with all the
implementation details.

Anyway, the cause is clear, and it will be fixed in trunk (and I believe that
the fix can be backported to the 4.9 branch).

Thanks for reporting :)



* [Bug c++/63497] std::regex can't handle [^class] correctly and cause runtime crash
  2014-10-09 12:00 [Bug c++/63497] New: std::regex can't handle [^class] correctly and cause runtime crash moophy at foxmail dot com
                   ` (3 preceding siblings ...)
  2014-10-10 21:07 ` timshen at gcc dot gnu.org
@ 2014-10-23  3:34 ` timshen at gcc dot gnu.org
  2014-11-23 14:30 ` [Bug libstdc++/63497] " paolo.carlini at oracle dot com
                   ` (4 subsequent siblings)
  9 siblings, 0 replies; 11+ messages in thread
From: timshen at gcc dot gnu.org @ 2014-10-23  3:34 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63497

--- Comment #5 from Tim Shen <timshen at gcc dot gnu.org> ---
Author: timshen
Date: Thu Oct 23 03:15:52 2014
New Revision: 216572

URL: https://gcc.gnu.org/viewcvs?rev=216572&root=gcc&view=rev
Log:
    PR libstdc++/63497
    include/bits/regex_executor.h (_Executor::_M_word_boundary): Remove
    unused parameter.
    include/bits/regex_executor.tcc (_Executor::_M_dfs,
    _Executor::_M_word_boundary): Avoid dereferencing _M_current at _M_end
    or other invalid position.

Modified:
    trunk/libstdc++-v3/ChangeLog
    trunk/libstdc++-v3/include/bits/regex_executor.h
    trunk/libstdc++-v3/include/bits/regex_executor.tcc



* [Bug libstdc++/63497] std::regex can't handle [^class] correctly and cause runtime crash
  2014-10-09 12:00 [Bug c++/63497] New: std::regex can't handle [^class] correctly and cause runtime crash moophy at foxmail dot com
                   ` (4 preceding siblings ...)
  2014-10-23  3:34 ` timshen at gcc dot gnu.org
@ 2014-11-23 14:30 ` paolo.carlini at oracle dot com
  2014-11-23 18:51 ` redi at gcc dot gnu.org
                   ` (3 subsequent siblings)
  9 siblings, 0 replies; 11+ messages in thread
From: paolo.carlini at oracle dot com @ 2014-11-23 14:30 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63497

Paolo Carlini <paolo.carlini at oracle dot com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|moophy at foxmail dot com          |
          Component|c++                         |libstdc++

--- Comment #6 from Paolo Carlini <paolo.carlini at oracle dot com> ---
Tim, please either apply this to the 4_9-branch too (ask Jon?), or just close
the bug as fixed for 5.0.



* [Bug libstdc++/63497] std::regex can't handle [^class] correctly and cause runtime crash
  2014-10-09 12:00 [Bug c++/63497] New: std::regex can't handle [^class] correctly and cause runtime crash moophy at foxmail dot com
                   ` (5 preceding siblings ...)
  2014-11-23 14:30 ` [Bug libstdc++/63497] " paolo.carlini at oracle dot com
@ 2014-11-23 18:51 ` redi at gcc dot gnu.org
  2014-11-28  6:51 ` timshen at gcc dot gnu.org
                   ` (2 subsequent siblings)
  9 siblings, 0 replies; 11+ messages in thread
From: redi at gcc dot gnu.org @ 2014-11-23 18:51 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63497

--- Comment #7 from Jonathan Wakely <redi at gcc dot gnu.org> ---
Yes, I think this is OK for the branch.



* [Bug libstdc++/63497] std::regex can't handle [^class] correctly and cause runtime crash
  2014-10-09 12:00 [Bug c++/63497] New: std::regex can't handle [^class] correctly and cause runtime crash moophy at foxmail dot com
                   ` (6 preceding siblings ...)
  2014-11-23 18:51 ` redi at gcc dot gnu.org
@ 2014-11-28  6:51 ` timshen at gcc dot gnu.org
  2014-11-28 20:24 ` timshen at gcc dot gnu.org
  2023-07-20 11:31 ` redi at gcc dot gnu.org
  9 siblings, 0 replies; 11+ messages in thread
From: timshen at gcc dot gnu.org @ 2014-11-28  6:51 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63497

--- Comment #8 from Tim Shen <timshen at gcc dot gnu.org> ---
Author: timshen
Date: Fri Nov 28 06:50:34 2014
New Revision: 218138

URL: https://gcc.gnu.org/viewcvs?rev=218138&root=gcc&view=rev
Log:
    PR libstdc++/63497
    include/bits/regex_executor.tcc (_Executor::_M_dfs,
    _Executor::_M_word_boundary): Avoid dereferencing _M_current at _M_end
    or other invalid position.

Modified:
    branches/gcc-4_9-branch/libstdc++-v3/ChangeLog
    branches/gcc-4_9-branch/libstdc++-v3/include/bits/regex_executor.tcc



* [Bug libstdc++/63497] std::regex can't handle [^class] correctly and cause runtime crash
  2014-10-09 12:00 [Bug c++/63497] New: std::regex can't handle [^class] correctly and cause runtime crash moophy at foxmail dot com
                   ` (7 preceding siblings ...)
  2014-11-28  6:51 ` timshen at gcc dot gnu.org
@ 2014-11-28 20:24 ` timshen at gcc dot gnu.org
  2023-07-20 11:31 ` redi at gcc dot gnu.org
  9 siblings, 0 replies; 11+ messages in thread
From: timshen at gcc dot gnu.org @ 2014-11-28 20:24 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63497

Tim Shen <timshen at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |RESOLVED
         Resolution|---                         |FIXED

--- Comment #9 from Tim Shen <timshen at gcc dot gnu.org> ---
Mark as resolved



* [Bug libstdc++/63497] std::regex can't handle [^class] correctly and cause runtime crash
  2014-10-09 12:00 [Bug c++/63497] New: std::regex can't handle [^class] correctly and cause runtime crash moophy at foxmail dot com
                   ` (8 preceding siblings ...)
  2014-11-28 20:24 ` timshen at gcc dot gnu.org
@ 2023-07-20 11:31 ` redi at gcc dot gnu.org
  9 siblings, 0 replies; 11+ messages in thread
From: redi at gcc dot gnu.org @ 2023-07-20 11:31 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63497

Jonathan Wakely <redi at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
   Target Milestone|---                         |4.9.3
