public inbox for elfutils@sourceware.org
 help / color / mirror / Atom feed
* Re: [PATCH] Add --core-pattern option to eu-stack
@ 2014-11-27 11:17 Mark Wielaard
  0 siblings, 0 replies; 17+ messages in thread
From: Mark Wielaard @ 2014-11-27 11:17 UTC (permalink / raw)
  To: elfutils-devel

[-- Attachment #1: Type: text/plain, Size: 3055 bytes --]

On Wed, 2014-11-26 at 15:44 +0100, Jan Kratochvil wrote:
> On Thu, 30 Oct 2014 22:55:41 +0100, Mark Wielaard wrote:
> > I did have to lookup how to use this though. And I couldn't make it work
> > without adding an -o option to eu-stack to explicitly redirect output
> > since it seems you cannot use normal shell redirections
> > in /proc/sys/kernel/core_pattern. So I had to use something like
> > --output=/proc/%p/cwd/%i.stack. Where does the stdout of the
> > core_pattern end up otherwise?
> 
> Nowhere as the only open fd is 0 - the piped core file:
> lr-x------ 1 root root 64 Nov 26 15:23 0 -> pipe:[5401953]
> 
> I did not expect (and did not use) eu-stack directly in
> /proc/sys/kernel/core_pattern .  I put there just a shell script which ended
> with 'exec eu-stack ...'.  This is why I have put into --help the comment:
> 	core file fd 0 must be owned only by this process (e.g. use shell exec)
> 
> 
> > It would probably be good to have a full example usage in the --help
> > message (and tell people to look at man 5 core).
> 
> One should IMO put into that shell script dumping of more associated
> information, such as /proc/PID/fd/ and other info found in ABRT, it needs to
> find some proper appendable output file and/or unique directory etc.  I do not
> find it all suitable for eu-stack --help.  That would be possibly appropriate
> for a man page but elfutils has no man pages.

The --help output is the best we have for now. So best to put
documentation there IMHO. But yes, having a real manual, man or info
pages for all elfutils tools would be better.

If we include the option/code then it really should be usable by users.
It looked usable as is with the addition of -o,--output. But I might be
missing something subtle.

> > > The code of core_pattern function has been suggested by Oleg Nesterov.
> > 
> > I was wondering if we could make this a little more generically usable
> > and call it --wait-exit or something like that. So people could also use
> > it outside the core_pattern if they just want to get a backtrace for a
> > known thread when it exits.
> 
> I was trying to figure out some more generic functionality than --core-pattern
> but given that for core_pattern one needs to:
> 	PTRACE_SEIZE
> 	close(0)
> 	waitpid()
> I do not find any part of this sequence separable, do you?

The close (0) is somewhat odd. I assume it signals the process is done
with handling the core file data and waitpid () never returns without
that?

But it does seem somewhat harmless. Are there any bad side effects from
closing stdin? We don't handle stdin at the moment. Although we could
maybe add --core=- to be explicit about the core data coming from stdin.

The waitpid () really is just waiting for PTRACE_EVENT_EXIT. So if you
wrap that in a while loop wouldn't it just work fine too in case the
user uses it with -p <tid> on a thread that they want to get the
backtrace of when it exits instead of in a core file pattern handler
pipe?

Thanks,

Mark

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH] Add --core-pattern option to eu-stack
@ 2014-11-28 15:04 Mark Wielaard
  0 siblings, 0 replies; 17+ messages in thread
From: Mark Wielaard @ 2014-11-28 15:04 UTC (permalink / raw)
  To: elfutils-devel

[-- Attachment #1: Type: text/plain, Size: 1260 bytes --]

On Thu, 2014-11-27 at 22:47 +0100, Jan Kratochvil wrote:
> On Thu, 27 Nov 2014 12:17:39 +0100, Mark Wielaard wrote:
> > The waitpid () really is just waiting for PTRACE_EVENT_EXIT. So if you
> > wrap that in a while loop wouldn't it just work fine too in case the
> > user uses it with -p <tid> on a thread that they want to get the
> > backtrace of when it exits instead of in a core file pattern handler
> > pipe?
> 
> Which way?

Something like the following quick hack seems to work:

  int status;
  do
    {
      pid_t got = waitpid (pid, &status, 0);
      if (got == -1)
       error (EXIT_BAD, errno, "waitpid ()");
      if (got != pid)
       error (EXIT_BAD, 0, "waitpid () returned %d but %d was expected",
              got, pid);
    }
  while (! WIFSTOPPED (status)
        && (status >> 8) != (SIGTRAP | (PTRACE_EVENT_EXIT << 8)));

Then I can point eu-stack at some thread and get a backtrace when it
exits. I am sure there is some more error handling to do. And I might
have some of corner cases. But that seems to be the basic idea to get
things to work outside the core_pattern filter. And IMHO it is an
interesting feature if you just want to know when and how a running
thread exits.

Cheers,

Mark


^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH] Add --core-pattern option to eu-stack
@ 2014-11-28 14:44 Mark Wielaard
  0 siblings, 0 replies; 17+ messages in thread
From: Mark Wielaard @ 2014-11-28 14:44 UTC (permalink / raw)
  To: elfutils-devel

[-- Attachment #1: Type: text/plain, Size: 1597 bytes --]

On Thu, 2014-11-27 at 22:41 +0100, Jan Kratochvil wrote:
> I just find it improbable one would not need a single other kind of
> information than what eu-stack can provide.

FWIW I found it useful. All other core_pattern programs I have seen seem
somewhat bloated and I normally deinstall them. It is nice to have a
small light-weight and quick core_pattern filter that just dumps the
exiting/crashing backtrace in a file and does nothing else. IMHO.

I don't think any distro would install it as standard core_pattern
filter, they probably go with abrt or systemd-coredump to get all the
fancy output. But for a hacker box it seems a nice thing to have around.

> > But if --core-pattern isn't useful without other code, then we should
> > include that code too.
> 
> Yes, there could be provided some "contrib"-like core dumping shell script.
> 
> 
> > Could you post what you have?
> 
> Nothing interesting, only for debugging eu-stack --core-pattern, when you ask:
> 	#! /bin/bash
> 	exec >>/tmp/out 2>>/tmp/out
> 	date --iso=seconds
> 	echo "$*"
> 	cd /home/jkratoch/redhat/elfutils
> 	. /home/jkratoch/t/elfutils-boot
> 	ldd ./src/stack
> 	# It locks up as strace holds fd 0:
> 	#strace -s200 -o /tmp/out.strace -q ./src/stack --core-pattern -1 --pid=$1
> 	exec ./src/stack --core-pattern -1 -l -m --pid=$1

So the missing functionality really is just redirecting the output (and
stderr)? What is in elfutils-boot?

BTW I was just using:
/usr/bin/eu-stack --core-pattern -v --p %i -o /proc/%p/cwd/%i.stack
without any shell wrapper.

Cheers,

Mark

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH] Add --core-pattern option to eu-stack
@ 2014-11-28 14:27 Mark Wielaard
  0 siblings, 0 replies; 17+ messages in thread
From: Mark Wielaard @ 2014-11-28 14:27 UTC (permalink / raw)
  To: elfutils-devel

[-- Attachment #1: Type: text/plain, Size: 3429 bytes --]

On Thu, 2014-11-27 at 22:27 +0100, Jan Kratochvil wrote:
> On Thu, 27 Nov 2014 12:04:39 +0100, Mark Wielaard wrote:
> > On Wed, 2014-11-26 at 15:32 +0100, Jan Kratochvil wrote:
> > > Therefore they are dead at the core_pattern time, they cannot be ptraced and
> > > therefore they cannot be unwound. One could only find them in the core file
> > > itself but that is outside of the scope of this eu-stack feature.
> > 
> > Why do you think that is out of scope? We get both the actual tid
> > through the command line option and the core file through stdin. Can't
> > we combine those two to get all information we want/need?
> 
> It could be extended that way but:
> 
> Currently elfutils is used only for the list of unwound PCs.  There is missing
> a big part of functionality present in GDB - displaying parameters, local
> variables, entry-value recovered values, data types pretty printing etc.
> And I have doubts unwind of non-crashed threads is useful at all.

But that not what I suggested. Although I think it would be cool to
extend eu-stack to also show arguments, locals, etc. That is an
interesting project on its own separate from this work.

> It is a new functionality on top of this patch that it has to read thread
> registers from PT_NOTE but access the memory via ptraced crashed thread.
> Fortunately core file PT_NOTE is first so it can be read quickly:
> Program Headers:
>   Type           Offset   VirtAddr           PhysAddr           FileSiz  MemSiz   Flg Align
>   NOTE           0x0004a0 0x0000000000000000 0x0000000000000000 0x0005f4 0x000000     0
> As reading the whole core file is in many times not possible (without some
> better kernel core dumping interface):
> 	-fsanitize=address locks up abrt-hook-ccpp
> 	https://bugzilla.redhat.com/show_bug.cgi?id=1164548

Is this bug being hit by elfutils/libdwfl core file parsing or a
specific issue with how abrt-hook-ccpp parses its input? I agree a nicer
interface for /proc/sys/kernel/core_pattern would be appreciated. But
that seems a kernel issue. We have to work with what we have now.

systemd has a standard core_pattern hook systemd-coredump (that also
uses elfutils/libdwfl to extract backtraces). Maybe we can work with
them on a interface/daemon to provide access to client programs that
wish to inspect the process/core? But again I think that is separate
from what we are currently working on.

> So I agree your proposal makes some sense but:
>  * I find it an incremental feature on top of this one.
>  * Given elfutils just lists the unwound PCs I do not see how it is useful.
> 
> What is your - or Martin's - opinion on such feature given these contraints?

I think getting backtraces for one or more threads of a crashing/exiting
process is an nice feature. My point is simply that I think that what
you wrote is a nice feature that we need to make into something that is
really usable. Whether that is by extending the support to something
more generic like --wait-pid that can be used independent of the
core_pattern filter or in providing in eu-stack all functionality
expected from a core_pattern filter program so that it is just a drop in
filter program doesn't really matter to me. All I am interested in is
that the eu-stack feature will actually be used and is not just some
example code that isn't really usable or used by actual users.

Cheers,

Mark

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH] Add --core-pattern option to eu-stack
@ 2014-11-27 21:47 Jan Kratochvil
  0 siblings, 0 replies; 17+ messages in thread
From: Jan Kratochvil @ 2014-11-27 21:47 UTC (permalink / raw)
  To: elfutils-devel

[-- Attachment #1: Type: text/plain, Size: 1497 bytes --]

On Thu, 27 Nov 2014 12:17:39 +0100, Mark Wielaard wrote:
> On Wed, 2014-11-26 at 15:44 +0100, Jan Kratochvil wrote:
> > I was trying to figure out some more generic functionality than --core-pattern
> > but given that for core_pattern one needs to:
> > 	PTRACE_SEIZE
> > 	close(0)
> > 	waitpid()
> > I do not find any part of this sequence separable, do you?
> 
> The close (0) is somewhat odd. I assume it signals the process is done
> with handling the core file data and waitpid () never returns without
> that?

Yes.


> But it does seem somewhat harmless. Are there any bad side effects from
> closing stdin? We don't handle stdin at the moment. Although we could
> maybe add --core=- to be explicit about the core data coming from stdin.

I have discussed reading core file for --core-pattern in my other mail.


> The waitpid () really is just waiting for PTRACE_EVENT_EXIT. So if you
> wrap that in a while loop wouldn't it just work fine too in case the
> user uses it with -p <tid> on a thread that they want to get the
> backtrace of when it exits instead of in a core file pattern handler
> pipe?

Which way?
#1
	PTRACE_SEIZE
	waitpid()
	 = lock-up on waitpid()
#2
	close(0); (in a process exec-ing eu-readelf)
	PTRACE_SEIZE
	 = ESRCH; not tested but I find it obvious, the process has disappeared
	waitpid()
	 = ECHILD; not tested but obviously there is no ptraced process

I do not see where a while loop could be applied.


Thanks,
Jan

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH] Add --core-pattern option to eu-stack
@ 2014-11-27 21:41 Jan Kratochvil
  0 siblings, 0 replies; 17+ messages in thread
From: Jan Kratochvil @ 2014-11-27 21:41 UTC (permalink / raw)
  To: elfutils-devel

[-- Attachment #1: Type: text/plain, Size: 1692 bytes --]

On Thu, 27 Nov 2014 12:05:51 +0100, Mark Wielaard wrote:
> We have -o,--output support in other utilities.

Those are ld, strip and unstrip.  That is not for text output but for the
resulting ELF files which is IMO a different case.

BTW I am not against it, I just find it redundant.  If thiAny program could
have -o|--output but in UNIX it is solved by the more general shell '>'.


> But maybe it isn't
> useful in general and people will always use shell redirection?

Currently it uses fopen "wx" which requires unique filename for each crash
dump unwind.  One could use %p in the filename but that is also not completely
unique.  Besides that one should IMO provide more information from /proc/PID/
(like fd/ ) than just the backtrace eu-stack can provide.  One is also
interested in /proc/PID/exe which is not obvious from -l or -m (at least for
PIEs); although one could possibly use %E for core_pattern.

I just find it improbable one would not need a single other kind of
information than what eu-stack can provide.


> But if --core-pattern isn't useful without other code, then we should
> include that code too.

Yes, there could be provided some "contrib"-like core dumping shell script.


> Could you post what you have?

Nothing interesting, only for debugging eu-stack --core-pattern, when you ask:
	#! /bin/bash
	exec >>/tmp/out 2>>/tmp/out
	date --iso=seconds
	echo "$*"
	cd /home/jkratoch/redhat/elfutils
	. /home/jkratoch/t/elfutils-boot
	ldd ./src/stack
	# It locks up as strace holds fd 0:
	#strace -s200 -o /tmp/out.strace -q ./src/stack --core-pattern -1 --pid=$1
	exec ./src/stack --core-pattern -1 -l -m --pid=$1


Jan

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH] Add --core-pattern option to eu-stack
@ 2014-11-27 21:27 Jan Kratochvil
  0 siblings, 0 replies; 17+ messages in thread
From: Jan Kratochvil @ 2014-11-27 21:27 UTC (permalink / raw)
  To: elfutils-devel

[-- Attachment #1: Type: text/plain, Size: 1811 bytes --]

On Thu, 27 Nov 2014 12:04:39 +0100, Mark Wielaard wrote:
> On Wed, 2014-11-26 at 15:32 +0100, Jan Kratochvil wrote:
> > Therefore they are dead at the core_pattern time, they cannot be ptraced and
> > therefore they cannot be unwound. One could only find them in the core file
> > itself but that is outside of the scope of this eu-stack feature.
> 
> Why do you think that is out of scope? We get both the actual tid
> through the command line option and the core file through stdin. Can't
> we combine those two to get all information we want/need?

It could be extended that way but:

Currently elfutils is used only for the list of unwound PCs.  There is missing
a big part of functionality present in GDB - displaying parameters, local
variables, entry-value recovered values, data types pretty printing etc.
And I have doubts unwind of non-crashed threads is useful at all.

It is a new functionality on top of this patch that it has to read thread
registers from PT_NOTE but access the memory via ptraced crashed thread.
Fortunately core file PT_NOTE is first so it can be read quickly:
Program Headers:
  Type           Offset   VirtAddr           PhysAddr           FileSiz  MemSiz   Flg Align
  NOTE           0x0004a0 0x0000000000000000 0x0000000000000000 0x0005f4 0x000000     0
As reading the whole core file is in many times not possible (without some
better kernel core dumping interface):
	-fsanitize=address locks up abrt-hook-ccpp
	https://bugzilla.redhat.com/show_bug.cgi?id=1164548

So I agree your proposal makes some sense but:
 * I find it an incremental feature on top of this one.
 * Given elfutils just lists the unwound PCs I do not see how it is useful.

What is your - or Martin's - opinion on such feature given these contraints?


Thanks,
Jan

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH] Add --core-pattern option to eu-stack
@ 2014-11-27 11:05 Mark Wielaard
  0 siblings, 0 replies; 17+ messages in thread
From: Mark Wielaard @ 2014-11-27 11:05 UTC (permalink / raw)
  To: elfutils-devel

[-- Attachment #1: Type: text/plain, Size: 596 bytes --]

On Wed, 2014-11-26 at 15:47 +0100, Jan Kratochvil wrote:
> On Mon, 03 Nov 2014 14:22:23 +0100, Mark Wielaard wrote:
> > For reference. Attached is the cleaned up -o support patch I added.
> 
> As I wrote I find that unrelated to --core-pattern which should IMO use
> a shell script (or other language) wrapper.

We have -o,--output support in other utilities. But maybe it isn't
useful in general and people will always use shell redirection? But if
--core-pattern isn't useful without other code, then we should include
that code too. Could you post what you have?

Thanks,

Mark

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH] Add --core-pattern option to eu-stack
@ 2014-11-27 11:04 Mark Wielaard
  0 siblings, 0 replies; 17+ messages in thread
From: Mark Wielaard @ 2014-11-27 11:04 UTC (permalink / raw)
  To: elfutils-devel

[-- Attachment #1: Type: text/plain, Size: 1325 bytes --]

On Wed, 2014-11-26 at 15:32 +0100, Jan Kratochvil wrote:
> On Thu, 30 Oct 2014 22:55:41 +0100, Mark Wielaard wrote:
> > On Thu, 2014-10-09 at 23:25 +0200, Jan Kratochvil wrote:
> > > +      if (opt_core_pattern == true && show_one_tid == false)
> > > +	argp_error (state,
> > > +		    N_("--core-pattern requires -1"));
> > 
> > Why this restriction?
> 
> I was blindly following Oleg's note which was not so obvious to me, though:
> 
> On Wed, 03 Sep 2014 16:26:41 +0200, Oleg Nesterov wrote:
> # Obviously, this way you can only inspect the thread which dumps the core.
> 
> Therefore I have now tried to remove this limitation of
>   -1                         Show the backtrace of only one thread
> 
> But I have found the other threads end up with:
> 	wait4(13902, [{WIFSIGNALED(s) && WTERMSIG(s) == SIGSEGV && WCOREDUMP(s)}], __WALL, NULL) = 13902
> 
> Therefore they are dead at the core_pattern time, they cannot be ptraced and
> therefore they cannot be unwound. One could only find them in the core file
> itself but that is outside of the scope of this eu-stack feature.

Why do you think that is out of scope? We get both the actual tid
through the command line option and the core file through stdin. Can't
we combine those two to get all information we want/need?

Thanks,

Mark

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH] Add --core-pattern option to eu-stack
@ 2014-11-26 20:57 Oleg Nesterov
  0 siblings, 0 replies; 17+ messages in thread
From: Oleg Nesterov @ 2014-11-26 20:57 UTC (permalink / raw)
  To: elfutils-devel

[-- Attachment #1: Type: text/plain, Size: 563 bytes --]

On 11/26, Jan Kratochvil wrote:
>
> On Wed, 26 Nov 2014 21:08:22 +0100, Oleg Nesterov wrote:
> > Yes. I guess PTRACE_ATTACH should work but it is pointless, this
> > thread can't report an event and other ptrace requests won't work.
> >
> > In fact I think PTRACE_ATTACH should fail in this case...
>
> I have used PTRACE_SEIZE instead:
>
> ptrace(PTRACE_SEIZE, 13902, 0, 0)       = 0

Sorry for confusion.

I meant, it would be better if PTRACE_ATTACH/PTRACE_SEIZE failed in
this case. But it is too late to change this old behaviour.

Oleg.


^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH] Add --core-pattern option to eu-stack
@ 2014-11-26 20:26 Jan Kratochvil
  0 siblings, 0 replies; 17+ messages in thread
From: Jan Kratochvil @ 2014-11-26 20:26 UTC (permalink / raw)
  To: elfutils-devel

[-- Attachment #1: Type: text/plain, Size: 518 bytes --]

On Wed, 26 Nov 2014 21:08:22 +0100, Oleg Nesterov wrote:
> Yes. I guess PTRACE_ATTACH should work but it is pointless, this
> thread can't report an event and other ptrace requests won't work.
> 
> In fact I think PTRACE_ATTACH should fail in this case...

I have used PTRACE_SEIZE instead:

ptrace(PTRACE_SEIZE, 13902, 0, 0)       = 0
ptrace(PTRACE_INTERRUPT, 13902, 0, 0)   = 0
 - here close(0);
wait4(13902, [{WIFSIGNALED(s) && WTERMSIG(s) == SIGSEGV && WCOREDUMP(s)}], __WALL, NULL) = 13902



Jan

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH] Add --core-pattern option to eu-stack
@ 2014-11-26 20:08 Oleg Nesterov
  0 siblings, 0 replies; 17+ messages in thread
From: Oleg Nesterov @ 2014-11-26 20:08 UTC (permalink / raw)
  To: elfutils-devel

[-- Attachment #1: Type: text/plain, Size: 1175 bytes --]

On 11/26, Jan Kratochvil wrote:
>
> On Thu, 30 Oct 2014 22:55:41 +0100, Mark Wielaard wrote:
> > On Thu, 2014-10-09 at 23:25 +0200, Jan Kratochvil wrote:
> > > +      if (opt_core_pattern == true && show_one_tid == false)
> > > +	argp_error (state,
> > > +		    N_("--core-pattern requires -1"));
> >
> > Why this restriction?
>
> I was blindly following Oleg's note which was not so obvious to me, though:
>
> On Wed, 03 Sep 2014 16:26:41 +0200, Oleg Nesterov wrote:
> # Obviously, this way you can only inspect the thread which dumps the core.
>
> Therefore I have now tried to remove this limitation of
>   -1                         Show the backtrace of only one thread
>
> But I have found the other threads end up with:
> 	wait4(13902, [{WIFSIGNALED(s) && WTERMSIG(s) == SIGSEGV && WCOREDUMP(s)}], __WALL, NULL) = 13902
>
> Therefore they are dead at the core_pattern time, they cannot be ptraced and
> therefore they cannot be unwound.

Yes. I guess PTRACE_ATTACH should work but it is pointless, this
thread can't report an event and other ptrace requests won't work.

In fact I think PTRACE_ATTACH should fail in this case...

Oleg.


^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH] Add --core-pattern option to eu-stack
@ 2014-11-26 14:47 Jan Kratochvil
  0 siblings, 0 replies; 17+ messages in thread
From: Jan Kratochvil @ 2014-11-26 14:47 UTC (permalink / raw)
  To: elfutils-devel

[-- Attachment #1: Type: text/plain, Size: 464 bytes --]

On Mon, 03 Nov 2014 14:22:23 +0100, Mark Wielaard wrote:
> For reference. Attached is the cleaned up -o support patch I added.

As I wrote I find that unrelated to --core-pattern which should IMO use
a shell script (or other language) wrapper.

Additionally the eu-stack patch may be also considered just as an example code
how to implement that functionality into your app (like what ABRT is
implementing, independently from eu-stack).


Thanks,
Jan

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH] Add --core-pattern option to eu-stack
@ 2014-11-26 14:44 Jan Kratochvil
  0 siblings, 0 replies; 17+ messages in thread
From: Jan Kratochvil @ 2014-11-26 14:44 UTC (permalink / raw)
  To: elfutils-devel

[-- Attachment #1: Type: text/plain, Size: 1906 bytes --]

On Thu, 30 Oct 2014 22:55:41 +0100, Mark Wielaard wrote:
> I did have to lookup how to use this though. And I couldn't make it work
> without adding an -o option to eu-stack to explicitly redirect output
> since it seems you cannot use normal shell redirections
> in /proc/sys/kernel/core_pattern. So I had to use something like
> --output=/proc/%p/cwd/%i.stack. Where does the stdout of the
> core_pattern end up otherwise?

Nowhere as the only open fd is 0 - the piped core file:
lr-x------ 1 root root 64 Nov 26 15:23 0 -> pipe:[5401953]

I did not expect (and did not use) eu-stack directly in
/proc/sys/kernel/core_pattern .  I put there just a shell script which ended
with 'exec eu-stack ...'.  This is why I have put into --help the comment:
	core file fd 0 must be owned only by this process (e.g. use shell exec)


> It would probably be good to have a full example usage in the --help
> message (and tell people to look at man 5 core).

One should IMO put into that shell script dumping of more associated
information, such as /proc/PID/fd/ and other info found in ABRT, it needs to
find some proper appendable output file and/or unique directory etc.  I do not
find it all suitable for eu-stack --help.  That would be possibly appropriate
for a man page but elfutils has no man pages.


> > The code of core_pattern function has been suggested by Oleg Nesterov.
> 
> I was wondering if we could make this a little more generically usable
> and call it --wait-exit or something like that. So people could also use
> it outside the core_pattern if they just want to get a backtrace for a
> known thread when it exits.

I was trying to figure out some more generic functionality than --core-pattern
but given that for core_pattern one needs to:
	PTRACE_SEIZE
	close(0)
	waitpid()
I do not find any part of this sequence separable, do you?


Thanks,
Jan

[-- Attachment #2: attachment.mht --]
[-- Type: message/rfc822, Size: 47 bytes --]


WzxlbWFpbC5tZXNzYWdlLk1lc3NhZ2UgaW5zdGFuY2UgYXQgMHgxNTM5MjAwPl0=

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH] Add --core-pattern option to eu-stack
@ 2014-11-26 14:32 Jan Kratochvil
  0 siblings, 0 replies; 17+ messages in thread
From: Jan Kratochvil @ 2014-11-26 14:32 UTC (permalink / raw)
  To: elfutils-devel

[-- Attachment #1: Type: text/plain, Size: 5281 bytes --]

On Thu, 30 Oct 2014 22:55:41 +0100, Mark Wielaard wrote:
> On Thu, 2014-10-09 at 23:25 +0200, Jan Kratochvil wrote:
> > +      if (opt_core_pattern == true && show_one_tid == false)
> > +	argp_error (state,
> > +		    N_("--core-pattern requires -1"));
> 
> Why this restriction?

I was blindly following Oleg's note which was not so obvious to me, though:

On Wed, 03 Sep 2014 16:26:41 +0200, Oleg Nesterov wrote:
# Obviously, this way you can only inspect the thread which dumps the core.

Therefore I have now tried to remove this limitation of
  -1                         Show the backtrace of only one thread

But I have found the other threads end up with:
	wait4(13902, [{WIFSIGNALED(s) && WTERMSIG(s) == SIGSEGV && WCOREDUMP(s)}], __WALL, NULL) = 13902

Therefore they are dead at the core_pattern time, they cannot be ptraced and
therefore they cannot be unwound.  One could only find them in the core file
itself but that is outside of the scope of this eu-stack feature.


Jan

diff --git a/src/stack.c b/src/stack.c
index 59ae826..6a8a3ee 100644
--- a/src/stack.c
+++ b/src/stack.c
@@ -28,11 +28,16 @@
 #include <fcntl.h>
 #include <sys/ptrace.h>
 #include <sys/wait.h>
+#include <dirent.h>
 #include ELFUTILS_HEADER(dwfl)
 
 #include <dwarf.h>
 #include <system.h>
 
+#ifndef PTRACE_EVENT_STOP
+# define PTRACE_EVENT_STOP 128
+#endif
+
 /* Name and version of program.  */
 static void print_version (FILE *stream, struct argp_state *state);
 ARGP_PROGRAM_VERSION_HOOK_DEF = print_version;
@@ -480,6 +485,74 @@ print_version (FILE *stream, struct argp_state *state __attribute__ ((unused)))
   fprintf (stream, "stack (%s) %s\n", PACKAGE_NAME, PACKAGE_VERSION);
 }
 
+/* Attach thread TID without waiting for it.  It must not be a crashed thread
+   core_pattern has been called for.  */
+
+static void
+core_pattern_seize_one (pid_t tid)
+{
+  if (ptrace (PTRACE_SEIZE, tid, NULL, NULL) != 0)
+    error (EXIT_BAD, errno, "ptrace (PTRACE_SEIZE, %d)", tid);
+  if (ptrace (PTRACE_INTERRUPT, tid, NULL, NULL) != 0)
+    error (EXIT_BAD, errno, "ptrace (PTRACE_INTERRUPT, %d)", tid);
+}
+
+/* Wait for thread TID.  It must not be a crashed thread core_pattern has been
+   called for.  */
+
+static void
+core_pattern_wait_for_one (pid_t tid)
+{
+  int status;
+  if (waitpid (tid, &status, __WALL) != tid)
+    error (EXIT_BAD, errno, "waitpid (%d, __WALL)", tid);
+  if (! WIFSIGNALED (status))
+    error (EXIT_BAD, 0, "waitpid (%d, __WALL) status 0x%x is not WIFSIGNALED",
+           tid, status);
+}
+
+/* Call CB for all threads of process EXCEPT_TID except tid EXCEPT_TID.
+   EXCEPT_TID does not have to be thread group leader.  */
+
+static void
+core_pattern_attach_other (pid_t except_tid, void (*cb) (pid_t tid))
+{
+  char dirname[64];
+  int i = snprintf (dirname, sizeof (dirname), "/proc/%d/task", except_tid);
+  assert (i > 0 && i < (ssize_t) sizeof (dirname) - 1);
+  DIR *dir = opendir (dirname);
+  if (dir == NULL)
+    error (EXIT_BAD, errno, "opendir (\"%s\")", dirname);
+  for (;;)
+    {
+      errno = 0;
+      struct dirent *dirent = readdir (dir);
+      if (dirent == NULL)
+	{
+	  if (errno != 0)
+	    error (EXIT_BAD, errno, "readdir (\"%s\")", dirname);
+	  break;
+	}
+      if (strcmp (dirent->d_name, ".") == 0
+	  || strcmp (dirent->d_name, "..") == 0)
+	continue;
+      char *end;
+      errno = 0;
+      long tidl = strtol (dirent->d_name, &end, 10);
+      if (errno != 0)
+	error (EXIT_BAD, errno, "strtol (\"%s\"->\"%s\")", dirname,
+	       dirent->d_name);
+      pid_t tid = tidl;
+      if (tidl <= 0 || (end && *end) || tid != tidl)
+	error (EXIT_BAD, 0, "Invalid TID (\"%s\"->\"%s\")", dirname,
+	       dirent->d_name);
+      if (tid != except_tid)
+	cb (tid);
+    }
+  if (closedir (dir) != 0)
+    error (EXIT_BAD, errno, "closedir (\"%s\")", dirname);
+}
+
 /* Provide PTRACE_ATTACH like operation compatible with Linux core_pattern
    handler.  */
 
@@ -489,10 +562,12 @@ core_pattern (void)
   if (ptrace (PTRACE_SEIZE, pid, NULL, (void *) (uintptr_t) PTRACE_O_TRACEEXIT)
       != 0)
     error (EXIT_BAD, errno, "ptrace (PTRACE_SEIZE, PTRACE_O_TRACEEXIT)");
+  if (! show_one_tid)
+    core_pattern_attach_other (pid, core_pattern_seize_one);
   if (close (0) != 0)
     error (EXIT_BAD, errno, "close (0; core file fd)");
   int status;
-  pid_t got = waitpid (pid, &status, 0);
+  pid_t got = waitpid (pid, &status, __WALL);
   if (got == -1)
     error (EXIT_BAD, errno, "waitpid ()");
   if (got != pid)
@@ -507,6 +582,8 @@ core_pattern (void)
 	   "waitpid () returned status 0x%x but (status >> 8)"
 	   " == (SIGTRAP | (PTRACE_EVENT_EXIT << 8)) was expected",
 	   status);
+  if (! show_one_tid)
+    core_pattern_attach_other (pid, core_pattern_wait_for_one);
 }
 
 static error_t
@@ -610,10 +687,6 @@ parse_opt (int key, char *arg __attribute__ ((unused)),
 	argp_error (state,
 		    N_("One of -p PID or --core COREFILE should be given."));
 
-      if (opt_core_pattern == true && show_one_tid == false)
-	argp_error (state,
-		    N_("--core-pattern requires -1"));
-
       if (pid != 0)
 	{
 	  dwfl = dwfl_begin (&proc_callbacks);

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH] Add --core-pattern option to eu-stack
@ 2014-11-03 13:22 Mark Wielaard
  0 siblings, 0 replies; 17+ messages in thread
From: Mark Wielaard @ 2014-11-03 13:22 UTC (permalink / raw)
  To: elfutils-devel

[-- Attachment #1: Type: text/plain, Size: 692 bytes --]

On Thu, 2014-10-30 at 22:55 +0100, Mark Wielaard wrote:
> I did have to lookup how to use this though. And I couldn't make it work
> without adding an -o option to eu-stack to explicitly redirect output
> since it seems you cannot use normal shell redirections
> in /proc/sys/kernel/core_pattern. So I had to use something like
> --output=/proc/%p/cwd/%i.stack. Where does the stdout of the
> core_pattern end up otherwise?

For reference. Attached is the cleaned up -o support patch I added.
It is a little paranoid setting umask and using O_EXCL explicitly.
But I think that is correct behavior when used in cases where normal
shell redirection doesn't work.

Cheers,

Mark

[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #2: 0001-stack-Add-output-o-option.patch --]
[-- Type: text/x-patch, Size: 7121 bytes --]

From 6e06c911e1d2a0f9cbd07a6bcaa6f76d6d7648b4 Mon Sep 17 00:00:00 2001
From: Mark Wielaard <mjw@redhat.com>
Date: Mon, 3 Nov 2014 14:16:57 +0100
Subject: [PATCH] stack: Add --output, -o option.

-o, --output=FILE Place output into FILE. FILE should not yet exists and
                  will be created.

Signed-off-by: Mark Wielaard <mjw@redhat.com>
---
 src/ChangeLog | 10 ++++++++++
 src/stack.c   | 56 ++++++++++++++++++++++++++++++++++++--------------------
 2 files changed, 46 insertions(+), 20 deletions(-)

diff --git a/src/ChangeLog b/src/ChangeLog
index a252cdc..70d6325 100644
--- a/src/ChangeLog
+++ b/src/ChangeLog
@@ -1,3 +1,13 @@
+2014-19-03  Mark Wielaard  <mjw@redhat.com>
+
+	* stack.c (fout): New static variable.
+	(module_callback): fprintf to fout.
+	(print_frame): Likewise.
+	(print_frames): Likewise.
+	(parse_opt): Handle 'o'.
+	(main): Add --output, -o to options. Set fout to stdout.
+	fprintf to fout.
+
 2014-09-14  Petr Machata  <pmachata@redhat.com>
 
 	* readelf.c (handle_relocs_rela): Typo fix, test DESTSHDR properly.
diff --git a/src/stack.c b/src/stack.c
index c277dfd..0b1d235 100644
--- a/src/stack.c
+++ b/src/stack.c
@@ -26,6 +26,8 @@
 #include <string.h>
 #include <locale.h>
 #include <fcntl.h>
+#include <sys/stat.h>
+#include <sys/types.h>
 #include ELFUTILS_HEADER(dwfl)
 
 #include <dwarf.h>
@@ -97,6 +99,9 @@ static char *demangle_buffer = NULL;
 /* Whether any frames have been shown at all.  Determines exit status.  */
 static bool frames_shown = false;
 
+/* Where --output/-o goes to (stdout by default).  */
+static FILE *fout;
+
 /* Program exit codes. All frames shown without any errors is GOOD.
    Some frames shown with some non-fatal errors is an ERROR.  A fatal
    error or no frames shown at all is BAD.  A command line USAGE exit
@@ -147,7 +152,7 @@ module_callback (Dwfl_Module *mod, void **userdata __attribute__((unused)),
   assert (strcmp (modname, name) == 0);
 
   int width = get_addr_width (mod);
-  printf ("0x%0*" PRIx64 "-0x%0*" PRIx64 " %s\n",
+  fprintf (fout, "0x%0*" PRIx64 "-0x%0*" PRIx64 " %s\n",
 	  width, start, width, end, basename (name));
 
   const unsigned char *id;
@@ -155,17 +160,17 @@ module_callback (Dwfl_Module *mod, void **userdata __attribute__((unused)),
   int id_len = dwfl_module_build_id (mod, &id, &id_vaddr);
   if (id_len > 0)
     {
-      printf ("  [");
+      fprintf (fout, "  [");
       do
-	printf ("%02" PRIx8, *id++);
+	fprintf (fout, "%02" PRIx8, *id++);
       while (--id_len > 0);
-      printf ("]\n");
+      fprintf (fout, "]\n");
     }
 
   if (elf != NULL)
-    printf ("  %s\n", mainfile != NULL ? mainfile : "-");
+    fprintf (fout, "  %s\n", mainfile != NULL ? mainfile : "-");
   if (dwarf != NULL)
-    printf ("  %s\n", debugfile != NULL ? debugfile : "-");
+    fprintf (fout, "  %s\n", debugfile != NULL ? debugfile : "-");
 
   return DWARF_CB_OK;
 }
@@ -219,10 +224,10 @@ print_frame (int nr, Dwarf_Addr pc, bool isactivation,
 	     Dwarf_Die *die)
 {
   int width = get_addr_width (mod);
-  printf ("#%-2u 0x%0*" PRIx64, nr, width, (uint64_t) pc);
+  fprintf (fout, "#%-2u 0x%0*" PRIx64, nr, width, (uint64_t) pc);
 
   if (show_activation)
-    printf ("%4s", ! isactivation ? "- 1" : "");
+    fprintf (fout, "%4s", ! isactivation ? "- 1" : "");
 
   if (symname != NULL)
     {
@@ -237,7 +242,7 @@ print_frame (int nr, Dwarf_Addr pc, bool isactivation,
 	    symname = demangle_buffer = dsymname;
 	}
 #endif
-      printf (" %s", symname);
+      fprintf (fout, " %s", symname);
     }
 
   const char* fname;
@@ -247,7 +252,7 @@ print_frame (int nr, Dwarf_Addr pc, bool isactivation,
   if (show_module)
     {
       if (fname != NULL)
-	printf (" - %s", fname);
+	fprintf (fout, " - %s", fname);
     }
 
   if (show_build_id)
@@ -257,11 +262,11 @@ print_frame (int nr, Dwarf_Addr pc, bool isactivation,
       int id_len = dwfl_module_build_id (mod, &id, &id_vaddr);
       if (id_len > 0)
 	{
-	  printf ("\n    [");
+	  fprintf (fout, "\n    [");
 	  do
-	    printf ("%02" PRIx8, *id++);
+	    fprintf (fout, "%02" PRIx8, *id++);
 	  while (--id_len > 0);
-	  printf ("]@0x%0" PRIx64 "+0x%" PRIx64,
+	  fprintf (fout, "]@0x%0" PRIx64 "+0x%" PRIx64,
 		  start, pc_adjusted - start);
 	}
     }
@@ -303,16 +308,16 @@ print_frame (int nr, Dwarf_Addr pc, bool isactivation,
 
       if (sname != NULL)
 	{
-	  printf ("\n    %s", sname);
+	  fprintf (fout, "\n    %s", sname);
 	  if (line > 0)
 	    {
-	      printf (":%d", line);
+	      fprintf (fout, ":%d", line);
 	      if (col > 0)
-		printf (":%d", col);
+		fprintf (fout, ":%d", col);
 	    }
 	}
     }
-  printf ("\n");
+  fprintf (fout, "\n");
 }
 
 static void
@@ -362,7 +367,7 @@ print_frames (struct frames *frames, pid_t tid, int dwflerr, const char *what)
   if (frames->frames > 0)
     frames_shown = true;
 
-  printf ("TID %d:\n", tid);
+  fprintf (fout, "TID %d:\n", tid);
   int frame_nr = 0;
   for (int nr = 0; nr < frames->frames && (maxframes == 0
 					   || frame_nr < maxframes); nr++)
@@ -560,6 +565,14 @@ parse_opt (int key, char *arg __attribute__ ((unused)),
       show_modules = true;
       break;
 
+    case 'o':
+      umask (S_IWGRP | S_IWOTH);
+      fout = fopen (arg, "wx");
+      if (fout == NULL)
+	error (EXIT_BAD, errno,
+	       N_("Cannot create and exclusively open output file '%s'"), arg);
+      break;
+
     case ARGP_KEY_END:
       if (core == NULL && exec != NULL)
 	argp_error (state,
@@ -658,6 +671,8 @@ main (int argc, char **argv)
 	N_("Additionally show inlined function frames using DWARF debuginfo if available (implies -d)"), 0 },
       { "module",  'm', NULL, 0,
 	N_("Additionally show module file information"), 0 },
+      { "output",  'o', "FILE", 0,
+	N_("Place output into FILE. FILE should not yet exists and will be created."), 0 },
       { "source",  's', NULL, 0,
 	N_("Additionally show source file information"), 0 },
       { "verbose", 'v', NULL, 0,
@@ -690,11 +705,12 @@ occured the program exits with return code 2.  If the program was \
 invoked with bad or missing arguments it will exit with return code 64.")
     };
 
+  fout = stdout;
   argp_parse (&argp, argc, argv, 0, NULL, NULL);
 
   if (show_modules)
     {
-      printf ("PID %d - %s module memory map\n", dwfl_pid (dwfl),
+      fprintf (fout, "PID %d - %s module memory map\n", dwfl_pid (dwfl),
 	      pid != 0 ? "process" : "core");
       if (dwfl_getmodules (dwfl, module_callback, NULL, 0) != 0)
 	error (EXIT_BAD, 0, "dwfl_getmodules: %s", dwfl_errmsg (-1));
@@ -727,7 +743,7 @@ invoked with bad or missing arguments it will exit with return code 64.")
     }
   else
     {
-      printf ("PID %d - %s\n", dwfl_pid (dwfl), pid != 0 ? "process" : "core");
+      fprintf (fout, "PID %d - %s\n", dwfl_pid (dwfl), pid != 0 ? "process" : "core");
       switch (dwfl_getthreads (dwfl, thread_callback, &frames))
 	{
 	case DWARF_CB_OK:
-- 
1.8.3.1


^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH] Add --core-pattern option to eu-stack
@ 2014-10-30 21:55 Mark Wielaard
  0 siblings, 0 replies; 17+ messages in thread
From: Mark Wielaard @ 2014-10-30 21:55 UTC (permalink / raw)
  To: elfutils-devel

[-- Attachment #1: Type: text/plain, Size: 1480 bytes --]

On Thu, 2014-10-09 at 23:25 +0200, Jan Kratochvil wrote:
> To backtrace crashed thread from Linux core_pattern handler a special operation
> similar to PTRACE_ATTACH needs to be done.  This is implemented as a demo in
> eu-stack's core_pattern function invoked by eu-stack's --core-pattern option.

Cute! And I saw the %i support made it into the mainline kernel:
https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/?id=b03023ecbdb76c1dec86b41ed80b123c22783220

I did have to lookup how to use this though. And I couldn't make it work
without adding an -o option to eu-stack to explicitly redirect output
since it seems you cannot use normal shell redirections
in /proc/sys/kernel/core_pattern. So I had to use something like
--output=/proc/%p/cwd/%i.stack. Where does the stdout of the
core_pattern end up otherwise?

It would probably be good to have a full example usage in the --help
message (and tell people to look at man 5 core).

> The code of core_pattern function has been suggested by Oleg Nesterov.

I was wondering if we could make this a little more generically usable
and call it --wait-exit or something like that. So people could also use
it outside the core_pattern if they just want to get a backtrace for a
known thread when it exits.

> +      if (opt_core_pattern == true && show_one_tid == false)
> +	argp_error (state,
> +		    N_("--core-pattern requires -1"));

Why this restriction?

Thanks,

Mark


^ permalink raw reply	[flat|nested] 17+ messages in thread

end of thread, other threads:[~2014-11-28 15:04 UTC | newest]

Thread overview: 17+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2014-11-27 11:17 [PATCH] Add --core-pattern option to eu-stack Mark Wielaard
  -- strict thread matches above, loose matches on Subject: below --
2014-11-28 15:04 Mark Wielaard
2014-11-28 14:44 Mark Wielaard
2014-11-28 14:27 Mark Wielaard
2014-11-27 21:47 Jan Kratochvil
2014-11-27 21:41 Jan Kratochvil
2014-11-27 21:27 Jan Kratochvil
2014-11-27 11:05 Mark Wielaard
2014-11-27 11:04 Mark Wielaard
2014-11-26 20:57 Oleg Nesterov
2014-11-26 20:26 Jan Kratochvil
2014-11-26 20:08 Oleg Nesterov
2014-11-26 14:47 Jan Kratochvil
2014-11-26 14:44 Jan Kratochvil
2014-11-26 14:32 Jan Kratochvil
2014-11-03 13:22 Mark Wielaard
2014-10-30 21:55 Mark Wielaard

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).