From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <elfutils-devel-return-6042-listarch-elfutils-devel=sourceware.org@sourceware.org>
Received: (qmail 130544 invoked by alias); 26 Apr 2017 14:33:24 -0000
Mailing-List: contact elfutils-devel-help@sourceware.org; run by ezmlm
Precedence: bulk
List-Id: <elfutils-devel.sourceware.org>
List-Post: <mailto:elfutils-devel@sourceware.org>
List-Help: <mailto:elfutils-devel-help@sourceware.org>
List-Subscribe: <mailto:elfutils-devel-subscribe@sourceware.org>
Sender: elfutils-devel-owner@sourceware.org
Received: (qmail 130518 invoked by uid 89); 26 Apr 2017 14:33:24 -0000
Authentication-Results: sourceware.org; auth=none
X-HELO: gnu.wildebeest.org
Message-ID: <1493217200.31726.59.camel@klomp.org>
Subject: Re: [PATCH 5/5] Add frame pointer unwinding for aarch64
From: Mark Wielaard <mark@klomp.org>
To: Ulf Hermann <ulf.hermann@qt.io>
Cc: elfutils-devel@sourceware.org
Date: Wed, 26 Apr 2017 15:24:00 -0000
In-Reply-To: <3b0d6718-cf17-9ae1-b5f7-8c6413b8d3d2@qt.io>
References: <1493124006.31726.33.camel@klomp.org>
	 <1493124579-21017-1-git-send-email-mark@klomp.org>
	 <1493124579-21017-5-git-send-email-mark@klomp.org>
	 <1493125881.31726.44.camel@klomp.org>
	 <3b0d6718-cf17-9ae1-b5f7-8c6413b8d3d2@qt.io>
Content-Type: text/plain; charset="UTF-8"
Content-Transfer-Encoding: quoted-printable
X-Mailer: Evolution 3.12.11 (3.12.11-22.el7) 
Mime-Version: 1.0
X-SW-Source: 2017-q2/txt/msg00098.txt.bz2

On Tue, 2017-04-25 at 15:38 +0200, Ulf Hermann wrote:
> > My question is about this "initial frame". In our testcase we don't have
> > this case since the backtrace starts in a function that has some CFI.
> > But I assume you have some tests that rely on this behavior.
>
> Actually the test I provided does exercise this code. The initial
> __libc_do_syscall() frame does not have CFI. Only raise() has. You can
> check that by dropping the code for pc & 0x1.

Maybe I am using the wrong binaries (exec and core), but for me there is
no difference.

With or without commenting out the adjustments:

diff --git a/backends/aarch64_unwind.c b/backends/aarch64_unwind.c
index cac4ebd..36cd0e1 100644
--- a/backends/aarch64_unwind.c
+++ b/backends/aarch64_unwind.c
@@ -63,6 +63,7 @@ EBLHOOK(unwind) (Ebl *ebl __attribute__ ((unused)), Dwarf_Addr pc __attribute__
 
   // The initial frame is special. We are expected to return lr directly in this case, and we'll
   // come back to the same frame again in the next round.
+/*
   if ((pc & 0x1) == 0)
     {
       newLr = lr;
@@ -70,6 +71,7 @@ EBLHOOK(unwind) (Ebl *ebl __attribute__ ((unused)), Dwarf_Addr pc __attribute__
       newSp = sp;
     }
   else
+*/
     {
       if (!readfunc(fp + LR_OFFSET, &newLr, arg))
         newLr = 0;
@@ -80,7 +82,7 @@ EBLHOOK(unwind) (Ebl *ebl __attribute__ ((unused)), Dwarf_Addr pc __attribute__
       newSp = fp + SP_OFFSET;
     }
 
-  newPc = newLr & (~0x1);
+  newPc = newLr /* & (~0x1) */;
   if (!setfunc(-1, 1, &newPc, arg))
     return false;
 
@@ -92,5 +94,5 @@ EBLHOOK(unwind) (Ebl *ebl __attribute__ ((unused)), Dwarf_Addr pc __attribute__
   // If the fp is invalid, we might still have a valid lr.
   // But if the fp is valid, then the stack should be moving in the right direction.
   // Except, if this is the initial frame. Then the stack doesn't move.
-  return newPc != 0 && (fp == 0 || newSp > sp || (pc & 0x1) == 0);
+  return newPc != 0 && (fp == 0 || newSp > sp /* || (pc & 0x1) == 0 */);
 }

The testcase (run-backtrace-fp-core-aarch64.sh) PASSes and produces the
same output for:

LD_LIBRARY_PATH=backends:libelf:libdw src/stack -v --exec
backtrace.aarch64.fp.exec --core backtrace.aarch64.fp.core

PID 349 - core
TID 350:
#0  0x000000000040583c     raise - /home/ulf/backtrace.aarch64.fp.exec
    ../nptl/sysdeps/unix/sysv/linux/pt-raise.c:37
#1  0x0000000000401aac - 1 sigusr2 - /home/ulf/backtrace.aarch64.fp.exec
#2  0x0000000000401ba8 - 1 stdarg - /home/ulf/backtrace.aarch64.fp.exec
#3  0x0000000000401c04 - 1 backtracegen - /home/ulf/backtrace.aarch64.fp.exec
#4  0x0000000000401c10 - 1 start - /home/ulf/backtrace.aarch64.fp.exec
#5  0x0000000000402f44 - 1 start_thread - /home/ulf/backtrace.aarch64.fp.exec
    /build/glibc-MsMi75/glibc-2.19/nptl/pthread_create.c:311
#6  0x000000000041dc70 - 1 __clone - /home/ulf/backtrace.aarch64.fp.exec
TID 349:
#0  0x0000000000403fcc     pthread_join - /home/ulf/backtrace.aarch64.fp.exec
    /build/glibc-MsMi75/glibc-2.19/nptl/pthread_join.c:92
#1  0x0000000000401810 - 1 main - /home/ulf/backtrace.aarch64.fp.exec
#2  0x0000000000406544 - 1 __libc_start_main - /home/ulf/backtrace.aarch64.fp.exec
#3  0x0000000000401918 - 1 $x - /home/ulf/backtrace.aarch64.fp.exec
src/stack: dwfl_thread_getframes tid 349 at 0x401917 in /home/ulf/backtrace.aarch64.fp.exec: address out of range

Since I cannot find __libc_do_syscall anywhere in this output, I assume
I am not using the right executable and core. Could you double check
them on the mjw/fp-unwind branch?

> > The first question is how/why the (pc & 0x1) == 0 check works?
> > Why is that the correct check?
> >
> > Secondly, if it really is the initial (or signal frame) we are after,
> > should we pass it in via the bool *signal_framep argument. Currently we don't
>
> We have this piece of code in __libdwfl_frame_unwind, in frame_unwind.c:
>
>   if (! state->initial_frame && ! state->signal_frame)
>       pc--;
>
> AArch64 has a fixed instruction width of 32bit. So, normally the pc is
> aligned to 4 bytes. Except if we decrement it, then we are guaranteed
> to have an odd number, which we can then test to see if the frame in
> question is the initial or a signal frame.

Aha, OK. I forgot we explicitly decrement the pc for the frame before
doing the actual unwind. That makes sense.
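
To make the trick concrete, here is a minimal sketch in C. Both helper
names (adjust_pc and looks_initial_or_signal) are made up for
illustration and are not elfutils functions; the first only mimics the
pc-- snippet quoted above, the second mimics the (pc & 0x1) == 0 test
from the patch. It assumes every real aarch64 pc is 4-byte aligned.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: mimics the adjustment quoted from
   __libdwfl_frame_unwind.  Only non-initial, non-signal frames
   get their pc decremented before the backend sees it.  */
uint64_t
adjust_pc (uint64_t pc, bool initial_frame, bool signal_frame)
{
  if (! initial_frame && ! signal_frame)
    pc--;
  return pc;
}

/* Hypothetical helper: the check the aarch64 backend applies.
   A 4-byte aligned pc is even, so an even value means "not
   decremented", i.e. an initial or signal frame.  */
bool
looks_initial_or_signal (uint64_t adjusted_pc)
{
  return (adjusted_pc & 0x1) == 0;
}
```

With the addresses from the output above, 0x40583c (frame #0, passed
through unchanged) stays even, while the return address 0x401aac of
frame #1 becomes 0x401aab and is therefore odd.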

> Of course it would be nicer to pass this information directly, but the
> signal_frame parameter is supposed to be an output parameter. After
> all we do the following after calling ebl_unwind():
>
>   state->unwound->signal_frame = signal_frame;

Right, but that doesn't mean we couldn't also provide it as input if we
already know that it is a signal or initial frame. It just means that
unwinders would have to explicitly set it to false if they cannot
determine it for the unwound frame (which is the case for all of them
except the s390x unwinder). It would really be just a one-line change in
the call to the unwinder functions and in the functions themselves. This
isn't a public API, so we can change it to be smarter.
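
Roughly what I have in mind, as a simplified sketch only. The function
sketch_unwind and its parameters are hypothetical and much reduced
compared to the real unwind hook (no Ebl, no setfunc/readfunc
callbacks); saved_lr stands in for the value that would be read from
the frame record. The point is just that *signal_framep is read on
entry and explicitly reset before returning.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, simplified unwinder.  On entry *signal_framep tells
   whether the frame being unwound is an initial or signal frame.
   On exit it describes the unwound (caller) frame.  */
bool
sketch_unwind (uint64_t lr, uint64_t saved_lr, uint64_t *new_pc,
               bool *signal_framep)
{
  /* Input side: no need to guess from (pc & 0x1) anymore.  */
  bool initial_or_signal = *signal_framep;

  /* Output side: this unwinder cannot determine whether the caller
     frame is a signal frame, so it must explicitly say "no".  */
  *signal_framep = false;

  /* Initial/signal frame: return lr directly.  Otherwise use the
     link register value saved in the frame record.  */
  *new_pc = initial_or_signal ? lr : saved_lr;
  return *new_pc != 0;
}
```

The caller would initialize the flag from what it already knows about
the current frame (initial or signal) and store the value back into the
unwound state after the call, as it does today.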

Cheers,

Mark