* [Bug sanitizer/59286] segfault in __sanitizer::StackDepotGet
2013-11-25 15:13 [Bug sanitizer/59286] New: segfault in __sanitizer::StackDepotGet Joost.VandeVondele at mat dot ethz.ch
@ 2013-11-26 6:58 ` Joost.VandeVondele at mat dot ethz.ch
2013-11-26 7:12 ` kcc at gcc dot gnu.org
` (7 subsequent siblings)
8 siblings, 0 replies; 10+ messages in thread
From: Joost.VandeVondele at mat dot ethz.ch @ 2013-11-26 6:58 UTC (permalink / raw)
To: gcc-bugs
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=59286
Joost VandeVondele <Joost.VandeVondele at mat dot ethz.ch> changed:
           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |Joost.VandeVondele at mat dot ethz.ch
--- Comment #2 from Joost VandeVondele <Joost.VandeVondele at mat dot ethz.ch> ---
(In reply to Kostya Serebryany from comment #1)
> does this happen with clang's tsan?
This is a Fortran package, so trying clang is not really an option.
From: kcc at gcc dot gnu.org @ 2013-11-26 7:12 UTC (permalink / raw)
To: gcc-bugs
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=59286
--- Comment #3 from Kostya Serebryany <kcc at gcc dot gnu.org> ---
Can you post the exact command line to reproduce the failure?
(We should have CP2K sources somewhere)
From: Joost.VandeVondele at mat dot ethz.ch @ 2013-11-26 7:25 UTC (permalink / raw)
To: gcc-bugs
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=59286
--- Comment #4 from Joost VandeVondele <Joost.VandeVondele at mat dot ethz.ch> ---
(In reply to Kostya Serebryany from comment #3)
> Can you post the exact command line to reproduce the failure?
> (We should have CP2K sources somewhere)
The exact command line is rather easy:
cd cp2k/tests/QS/regtest-hybrid-1
export OMP_NUM_THREADS=4 ; ../../../exe/test_tsan/cp2k.ssmp H2O-hybrid-pbe0.inp
However, building a tsan instrumented CP2K is non-trivial, as it requires
libgomp to be built with tsan (see
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=55561#c15 for a howto), and some of
the dependent libraries must be built with tsan as well. I'm happy to help, but
it could take some time. Should I post detailed instructions?
Meanwhile another CP2K input fails with:
Unexpected mmap in InternalAllocator!
From: kcc at gcc dot gnu.org @ 2013-11-26 10:44 UTC (permalink / raw)
To: gcc-bugs
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=59286
--- Comment #5 from Kostya Serebryany <kcc at gcc dot gnu.org> ---
> However, building a tsan instrumented CP2K is non-trivial, as it requires
Maybe let's do some remote debugging then :)
The crash looks like someone corrupted tsan's internal memory.
Can you insert some Printf statements in sanitizer_stackdepot.cc
to see how we get this crazy pointer value: 0x4d634810890c5593?
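As a rough illustration of the kind of tracing suggested here, the sketch below walks a linked chain of descriptors, prints every pointer it follows, and stops before dereferencing a link that cannot be a real address. It is not the libsanitizer code: `StackDesc` is a simplified stand-in that models only the fields named in this thread (`link`, `id`, `hash`, `size`), `LooksLikeValidPointer` and `FindWithTrace` are hypothetical helpers, and the plausibility check assumes a 64-bit target where user-space addresses lie below 2^47.

```cpp
#include <cstdint>
#include <cstdio>

// Simplified stand-in for __sanitizer::StackDesc. Only the fields mentioned in
// this thread are modeled; this is NOT the real libsanitizer layout.
struct StackDesc {
  StackDesc *link;
  uint32_t id;
  uint32_t hash;
  uintptr_t size;
};

static_assert(sizeof(uintptr_t) == 8, "sketch assumes a 64-bit target");

// Hypothetical plausibility check: non-null, aligned for StackDesc, and below
// 2^47 (assumed upper bound for user-space addresses on x86-64 Linux).
bool LooksLikeValidPointer(const StackDesc *s) {
  const uintptr_t p = reinterpret_cast<uintptr_t>(s);
  return p != 0 && p % alignof(StackDesc) == 0 && p < (uintptr_t{1} << 47);
}

// Walks one chain looking for `id`, logging every hop to stderr. Returns
// nullptr instead of dereferencing a link that fails the plausibility check.
const StackDesc *FindWithTrace(const StackDesc *head, uint32_t id) {
  std::fprintf(stderr, "Getting %p\n", static_cast<const void *>(head));
  for (const StackDesc *s = head; s != nullptr; s = s->link) {
    if (!LooksLikeValidPointer(s)) {
      std::fprintf(stderr, "Corrupted link %p\n", static_cast<const void *>(s));
      return nullptr;
    }
    std::fprintf(stderr, "Following %p\n", static_cast<const void *>(s));
    if (s->id == id) return s;
  }
  return nullptr;
}
```

A value such as 0x4d634810890c5593 is odd and far above 2^47, so it fails the check. The last pointer printed before the "Corrupted link" message identifies the node whose `link` field was overwritten.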
From: Joost.VandeVondele at mat dot ethz.ch @ 2013-11-26 11:57 UTC (permalink / raw)
To: gcc-bugs
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=59286
--- Comment #6 from Joost VandeVondele <Joost.VandeVondele at mat dot ethz.ch> ---
(In reply to Kostya Serebryany from comment #5)
> > However, building a tsan instrumented CP2K is non-trivial, as it requires
>
> Maybe let's do some remote debugging then :)
We can give this a try, but I might need rather detailed instructions, i.e.
starting from a sample printf statement and a suggested line number. With my
current setup, I get the segfault in a new location:
Program received signal SIGSEGV, Segmentation fault.
__sanitizer::find (s=0xffffffffffffffff, s@entry=0x7ffff4c2e8b8,
    stack=stack@entry=0x7ffff4d797b8, size=size@entry=19,
    hash=hash@entry=1116003544)
    at ../../../../gcc/libsanitizer/sanitizer_common/sanitizer_stackdepot.cc:109
109       if (s->hash == hash && s->size == size) {
(gdb) bt 7
#0  __sanitizer::find (s=0xffffffffffffffff, s@entry=0x7ffff4c2e8b8,
    stack=stack@entry=0x7ffff4d797b8, size=size@entry=19,
    hash=hash@entry=1116003544)
    at ../../../../gcc/libsanitizer/sanitizer_common/sanitizer_stackdepot.cc:109
#1  0x00007ffff35ae1a1 in __sanitizer::StackDepotPut (stack=0x7ffff4d797b8, size=19)
    at ../../../../gcc/libsanitizer/sanitizer_common/sanitizer_stackdepot.cc:150
#2  0x00007ffff3570e4d in __tsan::CurrentStackId (thr=0xffffffffffffffff,
    pc=140737301157816)
    at ../../../../gcc/libsanitizer/tsan/tsan_rtl.cc:305
#3  0x00007ffff35a1404 in __tsan::user_alloc (thr=0x7ffff4d79780,
    pc=140737276073318, sz=24, align=<optimized out>)
    at ../../../../gcc/libsanitizer/tsan/tsan_mman.cc:110
#4  0x00007ffff358d58c in __interceptor_malloc (size=24)
    at ../../../../gcc/libsanitizer/tsan/tsan_interceptors.cc:447
#5  0x00007ffff57ba78c in timings::timestop (handle=339)
    at /data/vjoost/clean/cp2k/cp2k/src/../src/timings.F:328
#6  0x00007ffff76accf6 in dbcsr_error_handling::dbcsr_error_stop (handler=1, error=...)
    at /data/vjoost/clean/cp2k/cp2k/src/../src/dbcsr_lib/dbcsr_error_handling.F:180
(More stack frames follow...)
(gdb) print s
$3 = (__sanitizer::StackDesc *) 0xffffffffffffffff
From: kcc at gcc dot gnu.org @ 2013-11-26 13:36 UTC (permalink / raw)
To: gcc-bugs
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=59286
--- Comment #8 from Kostya Serebryany <kcc at gcc dot gnu.org> ---
Just insert more printfs everywhere you can :)
E.g. print everything around "s->link = s2" in StackDepotPut
From: Joost.VandeVondele at mat dot ethz.ch @ 2013-11-26 13:48 UTC (permalink / raw)
To: gcc-bugs
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=59286
--- Comment #9 from Joost VandeVondele <Joost.VandeVondele at mat dot ethz.ch> ---
(In reply to Kostya Serebryany from comment #8)
> Just insert more printfs everywhere you can :)
> E.g. print everything around "s->link = s2" in StackDepotPut
Hmm, I can write a lot of printfs, but that is not very targeted.
However, I think I got a little further. For this kind of crash:
Getting 0x7fffed22e328
Following 0x7ffff04b8a80
Following 0x40027bd6cd50653b
Program received signal SIGSEGV, Segmentation fault.
[Switching to Thread 0x7fffed2f5100 (LWP 9760)]
0x00007fffebbae530 in __sanitizer::StackDepotGet (id=3978818064, size=0x0)
    at ../../../../gcc/libsanitizer/sanitizer_common/sanitizer_stackdepot.cc:197
197         if (s->id == id) {
(gdb) print s
$3 = (__sanitizer::StackDesc *) 0x40027bd6cd50653b
I have put a hardware watchpoint on this field
break __sanitizer::StackDepotGet
awatch ((StackDesc*)0x7ffff04b8a80)->link
(which is the link that gets corrupted).
This watchpoint is triggered from CP2K at:
[Switching to Thread 0x7fffed3ec100 (LWP 9804)]
Hardware access (read/write) watchpoint 13: ((StackDesc*)0x7ffff04b8a80)->link
Value = (PTR TO -> ( __sanitizer::StackDesc )) 0x40027bd6cd50653b
0x00007fffee8811fe in hfx_load_balance_methods::estimate_basic (p=...)
    at /data/vjoost/clean/cp2k/cp2k/src/../src/hfx_load_balance_methods.F:1829
1829          p1=p(1) ; p2=p(2) ; p3=p(3) ; p4=p(4)
(gdb) bt
#0  0x00007fffee8811fe in hfx_load_balance_methods::estimate_basic (p=...)
    at /data/vjoost/clean/cp2k/cp2k/src/../src/hfx_load_balance_methods.F:1829
#1  0x00007fffee881020 in hfx_load_balance_methods::cost_model (nsa=1, nsb=1,
    nsc=1, nsd=1, npgfa=6, npgfb=6, npgfc=6, npgfd=6,
    ratio=-0.3026277383289448, p1=..., p2=..., p3=...)
    at /data/vjoost/clean/cp2k/cp2k/src/../src/hfx_load_balance_methods.F:1817
#2  0x00007fffee87ee8c in hfx_load_balance_methods::estimate_block_cost (natom=3,
    nkind=2, list_ij=..., list_kl=..., set_list_ij=..., set_list_kl=...,
    iatom_start=1, iatom_end=1, jatom_start=1, jatom_end=1,
    katom_start=1, katom_end=1, latom_start=1, latom_end=1,
    particle_set=..., coeffs_set=..., coeffs_kind=...,
    is_assoc_atomic_block_global=..., do_periodic=.FALSE., kind_of=...,
    basis_parameter=..., pmax_set=..., pmax_atom=..., pmax_blocks=0,
    cell=0x7d3000012d80, do_p_screening=.FALSE., map_atom_to_kind_atom=...,
    eval_type=1, log10_eps_schwarz=-10, log_2=0.3010299956639812,
    coeffs_kind_max0=1.1049525569372649, use_virial=.FALSE.,
    atomic_pair_list=...)
    at /data/vjoost/clean/cp2k/cp2k/src/../src/hfx_load_balance_methods.F:2212
This is the 'correct' place for the corruption, as this routine is only called
in those runs that segfault.
Potentially interesting: this routine is also somewhat special in Fortran, i.e.
it is a contained subroutine, which the compiler presumably treats specially
(I am not sure about the C-like equivalent, maybe nested functions?).
From: Joost.VandeVondele at mat dot ethz.ch @ 2013-11-26 15:06 UTC (permalink / raw)
To: gcc-bugs
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=59286
--- Comment #10 from Joost VandeVondele <Joost.VandeVondele at mat dot ethz.ch> ---
Well, maybe there is a simpler reason. If I set
export OMP_STACKSIZE=32M
(i.e. the stack size for the threads), the segfault disappears. Does that sound
like a good reason (i.e. a tsan instrumented binary might require more stack),
or does this seem to be just a lucky coincidence?
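For readers who do not use OpenMP: OMP_STACKSIZE sets the stack size of the threads created by the OpenMP runtime. The closest equivalent for hand-written POSIX threads is `pthread_attr_setstacksize`, sketched below. This is only an illustration of the knob, not CP2K or libgomp code; `Worker`, `RunWithStackSize`, and the 1 MiB scratch buffer are made up for the example.

```cpp
#include <pthread.h>
#include <cstddef>

// Size of the scratch buffer each worker places on its own stack (arbitrary).
constexpr std::size_t kFrameBytes = 1 << 20;  // 1 MiB

// Thread body: touches one byte per 4 KiB of a large stack buffer so that the
// stack really is used, then reports success through `arg` (a bool*).
void *Worker(void *arg) {
  volatile unsigned char scratch[kFrameBytes];
  for (std::size_t i = 0; i < kFrameBytes; i += 4096) scratch[i] = 1;
  *static_cast<bool *>(arg) = (scratch[0] == 1);
  return nullptr;
}

// Runs Worker on a thread with a stack of `stack_bytes` bytes. Returns true
// only if the attribute was accepted and the thread ran and joined cleanly.
bool RunWithStackSize(std::size_t stack_bytes) {
  pthread_attr_t attr;
  if (pthread_attr_init(&attr) != 0) return false;
  bool ok = false;
  pthread_t tid;
  int rc = pthread_attr_setstacksize(&attr, stack_bytes);
  if (rc == 0) rc = pthread_create(&tid, &attr, Worker, &ok);
  pthread_attr_destroy(&attr);  // the attribute object is no longer needed
  if (rc != 0) return false;
  return pthread_join(tid, nullptr) == 0 && ok;
}
```

Calling `RunWithStackSize(32u * 1024 * 1024)` mirrors the 32M setting above. A thread that overruns a too-small stack can corrupt neighbouring memory or fault in an unrelated place, which would be consistent with the symptoms in this thread, though that link is an inference and not something the sketch demonstrates.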
From: pinskia at gcc dot gnu.org @ 2024-03-16 21:29 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=59286
Andrew Pinski <pinskia at gcc dot gnu.org> changed:
           What    |Removed                     |Added
----------------------------------------------------------------------------
         Resolution|---                         |INVALID
             Status|UNCONFIRMED                 |RESOLVED
--- Comment #12 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
Invalid, as mentioned: stack usage increases slightly with tsan enabled, so the
overall stack size needs to be increased.