* __memmove_avx_unaligned_erms throws segmentation fault in release mode
[not found] <343253870.2815115.1625762524578.ref@mail.yahoo.com>
@ 2021-07-08 16:42 ` Mahmood N
2021-07-09 7:57 ` Xi Ruoyao
0 siblings, 1 reply; 6+ messages in thread
From: Mahmood N @ 2021-07-08 16:42 UTC (permalink / raw)
To: gcc-help
Hi
I see that my program works fine in debug mode, but not in release mode. With GDB I was able to find the function where the error occurs. The code looks like this:
std::vector<std::vector<inst_trace_t> *> threadblock_traces;
...
printf("hello %d\n",threadblock_traces.size());
trace_kernel.get_next_threadblock_traces(threadblock_traces);
At the printf(), I see the size is 4. So, the vector is not empty. According to GDB, the backtrace is
Program received signal SIGSEGV, Segmentation fault.
__memmove_avx_unaligned_erms () at ../sysdeps/x86_64/multiarch/memmove-vec-unaligned-erms.S:384
384 ../sysdeps/x86_64/multiarch/memmove-vec-unaligned-erms.S: No such file or directory.
(gdb) bt
#0 __memmove_avx_unaligned_erms ()
at ../sysdeps/x86_64/multiarch/memmove-vec-unaligned-erms.S:384
#1 0x0000555555569849 in std::__copy_move<false, true, std::random_access_iterator_tag>::__copy_m<std::vector<inst_trace_t, std::allocator<inst_trace_t> >*> (__result=<optimized out>,
__last=<optimized out>, __first=<optimized out>) at /usr/include/c++/9/bits/stl_algobase.h:465
#2 std::__copy_move_a<false, std::vector<inst_trace_t, std::allocator<inst_trace_t> >* const*, std::vector<inst_trace_t, std::allocator<inst_trace_t> >**> (__result=<optimized out>,
__last=<optimized out>, __first=<optimized out>) at /usr/include/c++/9/bits/stl_algobase.h:404
#3 std::__copy_move_a2<false, __gnu_cxx::__normal_iterator<std::vector<inst_trace_t, std::allocator<inst_trace_t> >* const*, std::vector<std::vector<inst_trace_t, std::allocator<inst_trace_t> >*, std::allocator<std::vector<inst_trace_t, std::allocator<inst_trace_t> >*> > >, std::vector<inst_trace_t, std::allocator<inst_trace_t> >**> (__result=<optimized out>, __last=..., __first=...)
at /usr/include/c++/9/bits/stl_algobase.h:440
#4 std::copy<__gnu_cxx::__normal_iterator<std::vector<inst_trace_t, std::allocator<inst_trace_t> >* const*, std::vector<std::vector<inst_trace_t, std::allocator<inst_trace_t> >*, std::allocator<std::vector<inst_trace_t, std::allocator<inst_trace_t> >*> > >, std::vector<inst_trace_t, std::allocator<inst_trace_t> >**> (__result=<optimized out>, __last=..., __first=...)
at /usr/include/c++/9/bits/stl_algobase.h:474
#5 std::__uninitialized_copy<true>::__uninit_copy<__gnu_cxx::__normal_iterator<std::vector<inst_trace_t, std::allocator<inst_trace_t> >* const*, std::vector<std::vector<inst_trace_t, std::allocator<inst_trace_t> >*, std::allocator<std::vector<inst_trace_t, std::allocator<inst_trace_t> >*> > >, std::vector<inst_trace_t, std::allocator<inst_trace_t> >**> (__result=<optimized out>, __last=...,
__first=...) at /usr/include/c++/9/bits/stl_uninitialized.h:101
#6 std::uninitialized_copy<__gnu_cxx::__normal_iterator<std::vector<inst_trace_t, std::allocator<inst_trace_t> >* const*, std::vector<std::vector<inst_trace_t, std::allocator<inst_trace_t> >*, std::allocator<std::vector<inst_trace_t, std::allocator<inst_trace_t> >*> > >, std::vector<inst_trace_t, std::allocator<inst_trace_t> >**> (__result=<optimized out>, __last=..., __first=...)
at /usr/include/c++/9/bits/stl_uninitialized.h:140
#7 std::__uninitialized_copy_a<__gnu_cxx::__normal_iterator<std::vector<inst_trace_t, std::allocator<inst_trace_t> >* const*, std::vector<std::vector<inst_trace_t, std::allocator<inst_trace_t> >*, std::allocator<std::vector<inst_trace_t, std::allocator<inst_trace_t> >*> > >, std::vector<inst_trace_t, std::allocator<inst_trace_t> >**, std::vector<inst_trace_t, std::allocator<inst_trace_t> >*>
(__result=<optimized out>, __last=..., __first=...)
at /usr/include/c++/9/bits/stl_uninitialized.h:307
#8 std::vector<std::vector<inst_trace_t, std::allocator<inst_trace_t> >*, std::allocator<std::vector<inst_trace_t, std::allocator<inst_trace_t> >*> >::vector (
__x=std::vector of length 4, capacity 4 = {...}, this=0x7fffffffc4f0)
at /usr/include/c++/9/bits/stl_vector.h:555
#9 trace_shader_core_ctx::init_traces (this=0x55555696ec30, start_warp=0, end_warp=4, kernel=...)
at trace_driven.cc:486
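Frame #8 in the backtrace above is the copy constructor of the pointer vector, called directly from init_traces. That suggests the callee takes its argument by value, although this is an assumption because the declaration of get_next_threadblock_traces is not shown. A minimal sketch of the difference between the two call shapes, with the hypothetical names trace_ptrs, same_buffer_by_value and same_buffer_by_ref:

```cpp
#include <vector>

// Hypothetical alias; the real element type in the thread is
// std::vector<inst_trace_t> *, replaced here by std::vector<int> *.
using trace_ptrs = std::vector<std::vector<int> *>;

// By value: the argument is copy-constructed (this is where
// std::uninitialized_copy / memmove run), so the callee owns a separate buffer.
bool same_buffer_by_value(trace_ptrs v, const void *caller_data) {
  return static_cast<const void *>(v.data()) == caller_data;
}

// By const reference: no copy is made; the callee sees the caller's buffer.
bool same_buffer_by_ref(const trace_ptrs &v, const void *caller_data) {
  return static_cast<const void *>(v.data()) == caller_data;
}
```

Copying four pointers is cheap either way; the sketch only shows which call shape puts the vector copy constructor on the stack.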
With the readelf command, you can see the compile options for both the debug and release builds:
RELEASE:
$ readelf -p .GCC.command.line gpu-simulator/bin/release/accel-sim.out
String dump of section '.GCC.command.line':
[ 0] -I ./build/release
[ 13] -I ./trace-driven
[ 25] -I ./trace-parser
[ 37] -I /home/mahmood/accel-sim-framework/gpu-simulator/gpgpu-sim/libcuda
[ 7d] -I /home/mahmood/accel-sim-framework/gpu-simulator/gpgpu-sim/src
[ bf] -I /usr/local/cuda-11.2/include
[ df] -imultiarch x86_64-linux-gnu
[ fc] -D_GNU_SOURCE
[ 10a] main.cc
[ 112] -mtune=generic
[ 121] -march=x86-64
[ 12f] -auxbase-strip ./build/release/main.o
[ 155] -g3
[ 159] -O3
[ 15d] -Wall
[ 163] -std=c++11
[ 16e] -fPIC
[ 174] -frecord-gcc-switches
[ 18a] -fasynchronous-unwind-tables
[ 1a7] -fstack-protector-strong
[ 1c0] -Wformat-security
[ 1d2] -fstack-clash-protection
[ 1eb] -fcf-protection
DEBUG:
$ readelf -p .GCC.command.line gpu-simulator/bin/debug/accel-sim.out
String dump of section '.GCC.command.line':
[ 0] -I ./build/debug
[ 11] -I ./trace-driven
[ 23] -I ./trace-parser
[ 35] -I /home/mahmood/accel-sim-framework/gpu-simulator/gpgpu-sim/libcuda
[ 7b] -I /home/mahmood/accel-sim-framework/gpu-simulator/gpgpu-sim/src
[ bd] -I /usr/local/cuda-11.2/include
[ dd] -imultiarch x86_64-linux-gnu
[ fa] -D_GNU_SOURCE
[ 108] main.cc
[ 110] -mtune=generic
[ 11f] -march=x86-64
[ 12d] -auxbase-strip ./build/debug/main.o
[ 151] -g3
[ 155] -O0
[ 159] -Wall
[ 15f] -std=c++11
[ 16a] -fPIC
[ 170] -frecord-gcc-switches
[ 186] -fasynchronous-unwind-tables
[ 1a3] -fstack-protector-strong
[ 1bc] -Wformat-security
[ 1ce] -fstack-clash-protection
[ 1e7] -fcf-protection
I would like to know if a similar problem has been reported before.
Any idea about that?
Regards,
Mahmood
* Re: __memmove_avx_unaligned_erms throws segmentation fault in release mode
From: Xi Ruoyao @ 2021-07-09 7:57 UTC (permalink / raw)
To: Mahmood N, gcc-help; +Cc: gcc-help
On Thu, 2021-07-08 at 16:42 +0000, Mahmood N via Gcc-help wrote:
> Hi
>
> I see that my program works fine in the debug mode, but not in the
> release mode. With GDB I was able to find the function that got error.
> The code looks like
>
>
>
> std::vector<std::vector<inst_trace_t> *> threadblock_traces;
> ...
> printf("hello %d\n",threadblock_traces.size());
At least you can't use "%d" to print the result of
std::vector<T>::size() (which is not an `int`).
> trace_kernel.get_next_threadblock_traces(threadblock_traces);
>
Try to find the bug in your code with warnings, sanitizers, or
-D_GLIBCXX_DEBUG.
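A minimal sketch of the format fix, assuming the only goal is to print the element count. std::vector<T>::size() returns an unsigned size_type (std::size_t with the default allocator), so the matching printf conversion is %zu rather than %d. The helper name format_size is hypothetical.

```cpp
#include <cassert>
#include <cstdio>
#include <string>
#include <vector>

// Formats "hello <n>\n" using %zu, the conversion that matches std::size_t.
// Passing a std::size_t to "%d" (which expects int) is undefined behavior.
std::string format_size(std::size_t n) {
  char buf[64];
  int written = std::snprintf(buf, sizeof buf, "hello %zu\n", n);
  // snprintf returns the number of characters it wanted to write.
  assert(written > 0 && static_cast<std::size_t>(written) < sizeof buf);
  return std::string(buf, static_cast<std::size_t>(written));
}
```

Called as format_size(threadblock_traces.size()), this avoids the mismatch. With -Wall, GCC's -Wformat normally reports the original %d call at compile time.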
--
Xi Ruoyao <xry111@mengyan1223.wang>
School of Aerospace Science and Technology, Xidian University
* Re: __memmove_avx_unaligned_erms throws segmentation fault in release mode
From: Mahmood Naderan @ 2021-07-10 16:18 UTC (permalink / raw)
To: gcc-help
>Try to find the bug in your code with warnings, sanitizers, or
>-D_GLIBCXX_DEBUG.
I used this:
CXXFLAGS = -Wall -O2 -g3 -fPIC -std=c++11 -frecord-gcc-switches -D_GLIBCXX_DEBUG
But it seems that this is either not the right place for the flag or the syntax is wrong, because I see
[ 10a] -D _GLIBCXX_DEBUG
in the output of the readelf command.
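The separated form -D _GLIBCXX_DEBUG looks like the same option as recorded in the section (the -I entries in the earlier dumps are recorded with a space as well), so the macro is probably defined. One way to confirm this from inside the program, as a sketch with the hypothetical helper name glibcxx_debug_enabled:

```cpp
#include <vector>  // any standard header; the macro comes from the command line

// Returns true when this translation unit was compiled with -D_GLIBCXX_DEBUG.
// Only one return statement survives preprocessing, so this is valid
// C++11 constexpr.
constexpr bool glibcxx_debug_enabled() {
#ifdef _GLIBCXX_DEBUG
  return true;
#else
  return false;
#endif
}
```

Printing the result once at startup in each affected source file shows whether the flag reached that file's compile command. The macro changes the layout of standard containers, so it is normally applied to every translation unit that shares them.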
Regards,
Mahmood
* Re: __memmove_avx_unaligned_erms throws segmentation fault in release mode
From: Mahmood Naderan @ 2021-07-10 17:14 UTC (permalink / raw)
To: gcc-help
OK. I tried a few more things and got new symptoms of the crash.
I edited the Makefile in the folder where trace_driven.cc is compiled and, instead of using -O3, I used -O1. Now I get
free(): double free detected in tcache 2
at the same line where I received the segfault with -O3.
(gdb) bt
#0 __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:50
#1 0x00007ffff77de859 in __GI_abort () at abort.c:79
#2 0x00007ffff78493ee in __libc_message (action=action@entry=do_abort,
fmt=fmt@entry=0x7ffff7973285 "%s\n") at ../sysdeps/posix/libc_fatal.c:155
#3 0x00007ffff785147c in malloc_printerr (
str=str@entry=0x7ffff79755d0 "free(): double free detected in tcache 2") at malloc.c:5347
#4 0x00007ffff78530ed in _int_free (av=0x7ffff79a4b80 <main_arena>, p=0x5555fdbbd7f0,
have_lock=0) at malloc.c:4201
#5 0x0000555555562a88 in __gnu_cxx::new_allocator<std::vector<inst_trace_t, std::allocator<inst_trace_t> >*>::deallocate (__p=<optimized out>, this=0x7fffffffc5a0)
at /usr/include/c++/9/ext/new_allocator.h:119
#6 std::allocator_traits<std::allocator<std::vector<inst_trace_t, std::allocator<inst_trace_t> >*> >::deallocate (__n=<optimized out>, __p=<optimized out>, __a=...)
at /usr/include/c++/9/bits/alloc_traits.h:470
#7 std::_Vector_base<std::vector<inst_trace_t, std::allocator<inst_trace_t> >*, std::allocator<std::vector<inst_trace_t, std::allocator<inst_trace_t> >*> >::_M_deallocate (__n=<optimized out>,
__p=<optimized out>, this=0x7fffffffc5a0) at /usr/include/c++/9/bits/stl_vector.h:351
#8 std::_Vector_base<std::vector<inst_trace_t, std::allocator<inst_trace_t> >*, std::allocator<std::vector<inst_trace_t, std::allocator<inst_trace_t> >*> >::~_Vector_base (this=0x7fffffffc5a0,
__in_chrg=<optimized out>) at /usr/include/c++/9/bits/stl_vector.h:332
#9 std::vector<std::vector<inst_trace_t, std::allocator<inst_trace_t> >*, std::allocator<std::vector<inst_trace_t, std::allocator<inst_trace_t> >*> >::~vector (this=0x7fffffffc5a0,
__in_chrg=<optimized out>) at /usr/include/c++/9/bits/stl_vector.h:680
#10 trace_shader_core_ctx::init_traces (this=this@entry=0x55555696fb20, start_warp=0,
end_warp=end_warp@entry=4, kernel=...) at trace_driven.cc:487
The code looks like this:
    std::vector<std::vector<inst_trace_t> *> threadblock_traces;
    for (unsigned i = start_warp; i < end_warp; ++i) {
      trace_shd_warp_t *m_trace_warp = static_cast<trace_shd_warp_t *>(m_warp[i]);
      m_trace_warp->clear();
      threadblock_traces.push_back(&(m_trace_warp->warp_traces));
    }
    trace_kernel_info_t &trace_kernel =
        static_cast<trace_kernel_info_t &>(kernel);
    printf("hello %d\n",threadblock_traces.size());
    trace_kernel.get_next_threadblock_traces(threadblock_traces);
Any feedback is appreciated.
Regards,
Mahmood
* Re: __memmove_avx_unaligned_erms throws segmentation fault in release mode
From: Xi Ruoyao @ 2021-07-10 18:16 UTC (permalink / raw)
To: Mahmood Naderan, gcc-help
On Sat, 2021-07-10 at 17:14 +0000, Mahmood Naderan via Gcc-help wrote:
> OK. I tried to do some more tricks and got new signs of crash.
> I edited the Makefile in the folder that trace_driven.cc is compiled
> and instead of using -O3, I used -O1 and now I get
>
>
> free(): double free detected in tcache 2
It indicates a bug in your code (with 99.99% probability).
Even if you paid money for a commercial compiler, its supporting team
wouldn't help you to debug your code.
> std::vector<std::vector<inst_trace_t> *> threadblock_traces;
> for (unsigned i = start_warp; i < end_warp; ++i) {
> trace_shd_warp_t *m_trace_warp = static_cast<trace_shd_warp_t
> *>(m_warp[i]);
> m_trace_warp->clear();
> threadblock_traces.push_back(&(m_trace_warp->warp_traces));
> }
> trace_kernel_info_t &trace_kernel =
> static_cast<trace_kernel_info_t &>(kernel);
> printf("hello %d\n",threadblock_traces.size());
> trace_kernel.get_next_threadblock_traces(threadblock_traces);
There is no way to determine whether this snippet is correct, as its
behavior depends on what "m_trace_warp" is, what "clear" does, etc.
And you are still using %d for size_t, which is undefined behavior.
It seems you don't know what undefined behavior is, so it's very likely
there are more cases of undefined behavior in your code.
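One of the unknowns in the quoted loop is whether each m_warp[i] really points to a trace_shd_warp_t; if it does not, the static_cast compiles but any use of the result is undefined. A sketch of a run-time checked variant, assuming the base class is polymorphic. The types warp_base and trace_warp and both function names are hypothetical stand-ins, not the simulator's own.

```cpp
#include <vector>

// Hypothetical stand-ins for the real simulator types; only the shape matters.
struct warp_base {
  virtual ~warp_base() = default;  // polymorphic base, required by dynamic_cast
};
struct trace_warp : warp_base {
  std::vector<int> warp_traces;
};

// Downcast verified at run time: yields nullptr when the object is not
// actually a trace_warp, instead of a pointer that is undefined to use.
trace_warp *checked_downcast(warp_base *w) {
  return dynamic_cast<trace_warp *>(w);
}

// Collects a pointer to each warp's trace vector. Returns false as soon as
// an element has an unexpected dynamic type.
bool collect_traces(const std::vector<warp_base *> &warps,
                    std::vector<std::vector<int> *> &out) {
  for (warp_base *w : warps) {
    trace_warp *tw = checked_downcast(w);
    if (tw == nullptr) return false;
    out.push_back(&tw->warp_traces);
  }
  return true;
}
```

This only rules out one possible source of undefined behavior. Building with GCC's -fsanitize=address and -fsanitize=undefined covers a wider range of errors.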
--
Xi Ruoyao <xry111@mengyan1223.wang>
School of Aerospace Science and Technology, Xidian University
* Re: __memmove_avx_unaligned_erms throws segmentation fault in release mode
From: Mahmood Naderan @ 2021-07-11 8:05 UTC (permalink / raw)
To: gcc-help
>Even if you paid money for a commercial compiler, its supporting team
>wouldn't help you to debug your code.
I just wanted to know whether that STL vector error has been reported before, or whether there is any hint for finding the bug and for why the debug build works but -O1, -O2, and -O3 do not.
Regards,
Mahmood