public inbox for gcc-help@gcc.gnu.org
* __memmove_avx_unaligned_erms throws segmentation fault in release mode
       [not found] <343253870.2815115.1625762524578.ref@mail.yahoo.com>
@ 2021-07-08 16:42 ` Mahmood N
  2021-07-09  7:57   ` Xi Ruoyao
  0 siblings, 1 reply; 6+ messages in thread
From: Mahmood N @ 2021-07-08 16:42 UTC (permalink / raw)
  To: gcc-help

Hi

I see that my program works fine in debug mode but not in release mode. With GDB I was able to find the function where the crash occurs. The code looks like this:



  std::vector<std::vector<inst_trace_t> *> threadblock_traces;
  ...
  printf("hello %d\n",threadblock_traces.size());
  trace_kernel.get_next_threadblock_traces(threadblock_traces);



At the printf(), I see that the size is 4, so the vector is not empty. According to GDB, the backtrace is:



Program received signal SIGSEGV, Segmentation fault.
__memmove_avx_unaligned_erms () at ../sysdeps/x86_64/multiarch/memmove-vec-unaligned-erms.S:384
384     ../sysdeps/x86_64/multiarch/memmove-vec-unaligned-erms.S: No such file or directory.
(gdb) bt
#0  __memmove_avx_unaligned_erms ()
    at ../sysdeps/x86_64/multiarch/memmove-vec-unaligned-erms.S:384
#1  0x0000555555569849 in std::__copy_move<false, true, std::random_access_iterator_tag>::__copy_m<std::vector<inst_trace_t, std::allocator<inst_trace_t> >*> (__result=<optimized out>,
    __last=<optimized out>, __first=<optimized out>) at /usr/include/c++/9/bits/stl_algobase.h:465
#2  std::__copy_move_a<false, std::vector<inst_trace_t, std::allocator<inst_trace_t> >* const*, std::vector<inst_trace_t, std::allocator<inst_trace_t> >**> (__result=<optimized out>,
    __last=<optimized out>, __first=<optimized out>) at /usr/include/c++/9/bits/stl_algobase.h:404
#3  std::__copy_move_a2<false, __gnu_cxx::__normal_iterator<std::vector<inst_trace_t, std::allocator<inst_trace_t> >* const*, std::vector<std::vector<inst_trace_t, std::allocator<inst_trace_t> >*, std::allocator<std::vector<inst_trace_t, std::allocator<inst_trace_t> >*> > >, std::vector<inst_trace_t, std::allocator<inst_trace_t> >**> (__result=<optimized out>, __last=..., __first=...)
    at /usr/include/c++/9/bits/stl_algobase.h:440
#4  std::copy<__gnu_cxx::__normal_iterator<std::vector<inst_trace_t, std::allocator<inst_trace_t> >* const*, std::vector<std::vector<inst_trace_t, std::allocator<inst_trace_t> >*, std::allocator<std::vector<inst_trace_t, std::allocator<inst_trace_t> >*> > >, std::vector<inst_trace_t, std::allocator<inst_trace_t> >**> (__result=<optimized out>, __last=..., __first=...)
    at /usr/include/c++/9/bits/stl_algobase.h:474
#5  std::__uninitialized_copy<true>::__uninit_copy<__gnu_cxx::__normal_iterator<std::vector<inst_trace_t, std::allocator<inst_trace_t> >* const*, std::vector<std::vector<inst_trace_t, std::allocator<inst_trace_t> >*, std::allocator<std::vector<inst_trace_t, std::allocator<inst_trace_t> >*> > >, std::vector<inst_trace_t, std::allocator<inst_trace_t> >**> (__result=<optimized out>, __last=...,
    __first=...) at /usr/include/c++/9/bits/stl_uninitialized.h:101
#6  std::uninitialized_copy<__gnu_cxx::__normal_iterator<std::vector<inst_trace_t, std::allocator<inst_trace_t> >* const*, std::vector<std::vector<inst_trace_t, std::allocator<inst_trace_t> >*, std::allocator<std::vector<inst_trace_t, std::allocator<inst_trace_t> >*> > >, std::vector<inst_trace_t, std::allocator<inst_trace_t> >**> (__result=<optimized out>, __last=..., __first=...)
    at /usr/include/c++/9/bits/stl_uninitialized.h:140
#7  std::__uninitialized_copy_a<__gnu_cxx::__normal_iterator<std::vector<inst_trace_t, std::allocator<inst_trace_t> >* const*, std::vector<std::vector<inst_trace_t, std::allocator<inst_trace_t> >*, std::allocator<std::vector<inst_trace_t, std::allocator<inst_trace_t> >*> > >, std::vector<inst_trace_t, std::allocator<inst_trace_t> >**, std::vector<inst_trace_t, std::allocator<inst_trace_t> >*>
    (__result=<optimized out>, __last=..., __first=...)
    at /usr/include/c++/9/bits/stl_uninitialized.h:307
#8  std::vector<std::vector<inst_trace_t, std::allocator<inst_trace_t> >*, std::allocator<std::vector<inst_trace_t, std::allocator<inst_trace_t> >*> >::vector (
    __x=std::vector of length 4, capacity 4 = {...}, this=0x7fffffffc4f0)
    at /usr/include/c++/9/bits/stl_vector.h:555
#9  trace_shader_core_ctx::init_traces (this=0x55555696ec30, start_warp=0, end_warp=4, kernel=...)
    at trace_driven.cc:486





With the readelf command, you can see the compile options for both the debug and release modes:


RELEASE:
$ readelf -p .GCC.command.line gpu-simulator/bin/release/accel-sim.out

String dump of section '.GCC.command.line':
  [     0]  -I ./build/release
  [    13]  -I ./trace-driven
  [    25]  -I ./trace-parser
  [    37]  -I /home/mahmood/accel-sim-framework/gpu-simulator/gpgpu-sim/libcuda
  [    7d]  -I /home/mahmood/accel-sim-framework/gpu-simulator/gpgpu-sim/src
  [    bf]  -I /usr/local/cuda-11.2/include
  [    df]  -imultiarch x86_64-linux-gnu
  [    fc]  -D_GNU_SOURCE
  [   10a]  main.cc
  [   112]  -mtune=generic
  [   121]  -march=x86-64
  [   12f]  -auxbase-strip ./build/release/main.o
  [   155]  -g3
  [   159]  -O3
  [   15d]  -Wall
  [   163]  -std=c++11
  [   16e]  -fPIC
  [   174]  -frecord-gcc-switches
  [   18a]  -fasynchronous-unwind-tables
  [   1a7]  -fstack-protector-strong
  [   1c0]  -Wformat-security
  [   1d2]  -fstack-clash-protection
  [   1eb]  -fcf-protection



DEBUG:
$ readelf -p .GCC.command.line gpu-simulator/bin/debug/accel-sim.out

String dump of section '.GCC.command.line':
  [     0]  -I ./build/debug
  [    11]  -I ./trace-driven
  [    23]  -I ./trace-parser
  [    35]  -I /home/mahmood/accel-sim-framework/gpu-simulator/gpgpu-sim/libcuda
  [    7b]  -I /home/mahmood/accel-sim-framework/gpu-simulator/gpgpu-sim/src
  [    bd]  -I /usr/local/cuda-11.2/include
  [    dd]  -imultiarch x86_64-linux-gnu
  [    fa]  -D_GNU_SOURCE
  [   108]  main.cc
  [   110]  -mtune=generic
  [   11f]  -march=x86-64
  [   12d]  -auxbase-strip ./build/debug/main.o
  [   151]  -g3
  [   155]  -O0
  [   159]  -Wall
  [   15f]  -std=c++11
  [   16a]  -fPIC
  [   170]  -frecord-gcc-switches
  [   186]  -fasynchronous-unwind-tables
  [   1a3]  -fstack-protector-strong
  [   1bc]  -Wformat-security
  [   1ce]  -fstack-clash-protection
  [   1e7]  -fcf-protection




I would like to know whether a similar problem has been reported before.

Any ideas?

Regards,
Mahmood


* Re: __memmove_avx_unaligned_erms throws segmentation fault in release mode
  2021-07-08 16:42 ` __memmove_avx_unaligned_erms throws segmentation fault in release mode Mahmood N
@ 2021-07-09  7:57   ` Xi Ruoyao
  2021-07-10 16:18     ` Mahmood Naderan
  0 siblings, 1 reply; 6+ messages in thread
From: Xi Ruoyao @ 2021-07-09  7:57 UTC (permalink / raw)
  To: Mahmood N, gcc-help; +Cc: gcc-help

On Thu, 2021-07-08 at 16:42 +0000, Mahmood N via Gcc-help wrote:
> Hi
> 
> I see that my program works fine in the debug mode, but not in the
> release mode. With GDB I was able to find the function that got error.
> The code looks like
> 
> 
> 
>   std::vector<std::vector<inst_trace_t> *> threadblock_traces;
>   ...
>   printf("hello %d\n",threadblock_traces.size());

To begin with, you can't use "%d" to print the result of
std::vector<T>::size() (which is not an `int`).
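
As an illustration of the point above, here is a minimal sketch of printing a vector's size with a matching conversion. The helper name `format_size` is hypothetical and not from the original code; `%zu` is the standard printf conversion for `size_t` (available with -std=c++11), and casting to a type with a known conversion would be another option.

```cpp
#include <cstdio>
#include <string>
#include <vector>

// Hypothetical helper (not from the thread): formats a container size
// using "%zu", the conversion that matches size_t, instead of "%d",
// which expects an int.
std::string format_size(const std::vector<int> &v) {
  char buf[32];
  std::snprintf(buf, sizeof buf, "hello %zu", v.size());
  return std::string(buf);
}
```

With -Wall, GCC's -Wformat check normally warns about the "%d" form, so the warning output of the release build may be worth rereading.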

>   trace_kernel.get_next_threadblock_traces(threadblock_traces);
> 

Try to find the bug in your code with warnings, sanitizers, or
-D_GLIBCXX_DEBUG.

-- 
Xi Ruoyao <xry111@mengyan1223.wang>
School of Aerospace Science and Technology, Xidian University



* Re: __memmove_avx_unaligned_erms throws segmentation fault in release mode
  2021-07-09  7:57   ` Xi Ruoyao
@ 2021-07-10 16:18     ` Mahmood Naderan
  2021-07-10 17:14       ` Mahmood Naderan
  0 siblings, 1 reply; 6+ messages in thread
From: Mahmood Naderan @ 2021-07-10 16:18 UTC (permalink / raw)
  To: gcc-help

>Try to find the bug of your code with warnings, sanitizers, or
>-D_GLIBCXX_DEBUG.

I used this:

CXXFLAGS = -Wall -O2 -g3 -fPIC -std=c++11 -frecord-gcc-switches -D_GLIBCXX_DEBUG


But it seems that this is not the right place, or the syntax is wrong, because I see

  [   10a]  -D _GLIBCXX_DEBUG

in the output of the readelf command.
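
GCC accepts both `-Dname` and `-D name`, so the recorded spelling with a space is not necessarily a sign of a mistake. One way to check whether the macro actually reached a given translation unit is a small probe like the sketch below; the function name `glibcxx_debug_enabled` is hypothetical.

```cpp
// Hypothetical probe (not from the thread): reports whether this
// translation unit was compiled with _GLIBCXX_DEBUG defined.
bool glibcxx_debug_enabled() {
#ifdef _GLIBCXX_DEBUG
  return true;
#else
  return false;
#endif
}
```

Printing the result once at startup from the file in question (for example trace_driven.cc) would show whether the flag from CXXFLAGS is being applied there.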





Regards,
Mahmood









* Re: __memmove_avx_unaligned_erms throws segmentation fault in release mode
  2021-07-10 16:18     ` Mahmood Naderan
@ 2021-07-10 17:14       ` Mahmood Naderan
  2021-07-10 18:16         ` Xi Ruoyao
  0 siblings, 1 reply; 6+ messages in thread
From: Mahmood Naderan @ 2021-07-10 17:14 UTC (permalink / raw)
  To: gcc-help

OK. I tried a few more things and got new symptoms of the crash.
I edited the Makefile in the folder where trace_driven.cc is compiled and used -O1 instead of -O3; now I get


free(): double free detected in tcache 2


This happens at the same line where I got the segfault with -O3.




(gdb) bt
#0  __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:50
#1  0x00007ffff77de859 in __GI_abort () at abort.c:79
#2  0x00007ffff78493ee in __libc_message (action=action@entry=do_abort,
    fmt=fmt@entry=0x7ffff7973285 "%s\n") at ../sysdeps/posix/libc_fatal.c:155
#3  0x00007ffff785147c in malloc_printerr (
    str=str@entry=0x7ffff79755d0 "free(): double free detected in tcache 2") at malloc.c:5347
#4  0x00007ffff78530ed in _int_free (av=0x7ffff79a4b80 <main_arena>, p=0x5555fdbbd7f0,
    have_lock=0) at malloc.c:4201
#5  0x0000555555562a88 in __gnu_cxx::new_allocator<std::vector<inst_trace_t, std::allocator<inst_trace_t> >*>::deallocate (__p=<optimized out>, this=0x7fffffffc5a0)
    at /usr/include/c++/9/ext/new_allocator.h:119
#6  std::allocator_traits<std::allocator<std::vector<inst_trace_t, std::allocator<inst_trace_t> >*> >::deallocate (__n=<optimized out>, __p=<optimized out>, __a=...)
    at /usr/include/c++/9/bits/alloc_traits.h:470
#7  std::_Vector_base<std::vector<inst_trace_t, std::allocator<inst_trace_t> >*, std::allocator<std::vector<inst_trace_t, std::allocator<inst_trace_t> >*> >::_M_deallocate (__n=<optimized out>,
    __p=<optimized out>, this=0x7fffffffc5a0) at /usr/include/c++/9/bits/stl_vector.h:351
#8  std::_Vector_base<std::vector<inst_trace_t, std::allocator<inst_trace_t> >*, std::allocator<std::vector<inst_trace_t, std::allocator<inst_trace_t> >*> >::~_Vector_base (this=0x7fffffffc5a0,
    __in_chrg=<optimized out>) at /usr/include/c++/9/bits/stl_vector.h:332
#9  std::vector<std::vector<inst_trace_t, std::allocator<inst_trace_t> >*, std::allocator<std::vector<inst_trace_t, std::allocator<inst_trace_t> >*> >::~vector (this=0x7fffffffc5a0,
    __in_chrg=<optimized out>) at /usr/include/c++/9/bits/stl_vector.h:680
#10 trace_shader_core_ctx::init_traces (this=this@entry=0x55555696fb20, start_warp=0,
    end_warp=end_warp@entry=4, kernel=...) at trace_driven.cc:487




The code looks like


  std::vector<std::vector<inst_trace_t> *> threadblock_traces;
  for (unsigned i = start_warp; i < end_warp; ++i) {
    trace_shd_warp_t *m_trace_warp = static_cast<trace_shd_warp_t *>(m_warp[i]);
    m_trace_warp->clear();
    threadblock_traces.push_back(&(m_trace_warp->warp_traces));
  }
  trace_kernel_info_t &trace_kernel =
      static_cast<trace_kernel_info_t &>(kernel);
  printf("hello %d\n",threadblock_traces.size());
  trace_kernel.get_next_threadblock_traces(threadblock_traces);



Any feedback is appreciated.



Regards,
Mahmood










* Re: __memmove_avx_unaligned_erms throws segmentation fault in release mode
  2021-07-10 17:14       ` Mahmood Naderan
@ 2021-07-10 18:16         ` Xi Ruoyao
  2021-07-11  8:05           ` Mahmood Naderan
  0 siblings, 1 reply; 6+ messages in thread
From: Xi Ruoyao @ 2021-07-10 18:16 UTC (permalink / raw)
  To: Mahmood Naderan, gcc-help

On Sat, 2021-07-10 at 17:14 +0000, Mahmood Naderan via Gcc-help wrote:
> OK. I tried to do some more tricks and got new signs of crash.
> I edited the Makefile in the folder that trace_driven.cc is compiled
> and instead of using -O3, I used -O1 and now I get 
> 
> 
> free(): double free detected in tcache 2

It indicates a bug in your code (with 99.99% probability).

Even if you paid money for a commercial compiler, its support team
wouldn't help you to debug your code.

>   std::vector<std::vector<inst_trace_t> *> threadblock_traces;
>   for (unsigned i = start_warp; i < end_warp; ++i) {
>     trace_shd_warp_t *m_trace_warp = static_cast<trace_shd_warp_t
> *>(m_warp[i]);
>     m_trace_warp->clear();
>     threadblock_traces.push_back(&(m_trace_warp->warp_traces));
>   }
>   trace_kernel_info_t &trace_kernel =
>       static_cast<trace_kernel_info_t &>(kernel);
>   printf("hello %d\n",threadblock_traces.size());
>   trace_kernel.get_next_threadblock_traces(threadblock_traces);

There is no way to determine whether this snippet is correct, as its
behavior depends on what "m_trace_warp" is, what "clear" does, etc.
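
To illustrate one such dependency: a `static_cast` from a base pointer is only valid if the object really has the derived type, whereas `dynamic_cast` can verify that at run time. The sketch below uses hypothetical stand-in types (`warp_base`, `trace_warp`), because the real class definitions are not shown in the thread; it is not a diagnosis of the actual bug.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical stand-ins for the simulator's types; the real
// definitions are not shown in the thread.
struct warp_base {
  virtual ~warp_base() {}
};
struct trace_warp : warp_base {
  std::vector<int> warp_traces;
};

// Collects pointers to warp_traces, skipping null entries and objects
// that are not actually trace_warp instances, and never indexing past
// the end of the input.
std::vector<std::vector<int> *> collect_traces(
    const std::vector<warp_base *> &warps, std::size_t first,
    std::size_t last) {
  std::vector<std::vector<int> *> out;
  for (std::size_t i = first; i < last && i < warps.size(); ++i) {
    // dynamic_cast yields nullptr for a null input or a type mismatch.
    trace_warp *tw = dynamic_cast<trace_warp *>(warps[i]);
    if (tw != nullptr) out.push_back(&tw->warp_traces);
  }
  return out;
}
```

If the number of collected pointers differs from `end_warp - start_warp`, that would point at the contents of the warp array rather than at the vector copy where the crash is reported.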

And you are still using %d for size_t, which is undefined behavior.
It seems you don't know what undefined behavior is, so it's very likely
there are more cases of undefined behavior in your code.
-- 
Xi Ruoyao <xry111@mengyan1223.wang>
School of Aerospace Science and Technology, Xidian University



* Re: __memmove_avx_unaligned_erms throws segmentation fault in release mode
  2021-07-10 18:16         ` Xi Ruoyao
@ 2021-07-11  8:05           ` Mahmood Naderan
  0 siblings, 0 replies; 6+ messages in thread
From: Mahmood Naderan @ 2021-07-11  8:05 UTC (permalink / raw)
  To: gcc-help

>Even if you paid money for a commercial compiler, its supporting team
>wouldn't help you to debug your code.


I just wanted to know whether that STL vector error has been reported before, or whether there is any hint for finding the bug that makes the debug build work but not the -O1, -O2, or -O3 builds.



Regards,
Mahmood







