public inbox for gdb@sourceware.org
* GDB: etm traces decoding and breakpoints for arm targets
@ 2020-10-31 23:10 Zied Guermazi
  2020-11-02 11:59 ` Mike Leach
  2020-11-04 16:04 ` Luis Machado
  0 siblings, 2 replies; 5+ messages in thread
From: Zied Guermazi @ 2020-10-31 23:10 UTC
  To: coresight, toolchain, gdb

[-- Attachment #1: Type: text/plain, Size: 3093 bytes --]

hi,

while testing the implementation in gdb of branch tracing on arm
processors using etm, I faced a situation where a breakpoint was set,
was hit, and then the execution of the program was continued. While
decoding the generated traces, I saw the breakpoint address (0x400552)
executed twice, and then the following address (0x400554) also executed
twice. the instruction at 0x400554 is a BL (a function call), and its
second execution corrupts the function history.

here is a dump of generated trace elements


---------------------------------
trace_chan_id: 18
isa: CS_ETM_ISA_T32
start addr = 0x400552
end addr   = 0x400554
instructions count = 1
last_i_type: OCSD_INSTR_OTHER
last_i_subtype: OCSD_S_INSTR_NONE
last instruction was executed
last instruction size: 2
---------------------------------
trace_chan_id: 18
isa: CS_ETM_ISA_T32
start addr = 0x400552
end addr   = 0x400554
instructions count = 1
last_i_type: OCSD_INSTR_OTHER
last_i_subtype: OCSD_S_INSTR_NONE
last instruction was executed
last instruction size: 2
---------------------------------
trace_chan_id: 18
isa: CS_ETM_ISA_T32
start addr = 0x400554
end addr   = 0x400558
instructions count = 1
last_i_type: OCSD_INSTR_BR
last_i_subtype: OCSD_S_INSTR_BR_LINK
last instruction was executed
last instruction size: 4
---------------------------------
trace_chan_id: 18
isa: CS_ETM_ISA_T32
start addr = 0x400554
end addr   = 0x400558
instructions count = 1
last_i_type: OCSD_INSTR_BR
last_i_subtype: OCSD_S_INSTR_BR_LINK
last instruction was executed
last instruction size: 4

the explanation I have for this behavior is:

- when the software breakpoint was set, the instruction at 0x400552 was
replaced in memory with a BKPT instruction,

- when the breakpoint was hit, the original opcode was restored at
0x400552 and a BKPT was set at the next instruction address (0x400554),
then execution was continued,

- when the second breakpoint (0x400554) was hit, a BKPT opcode was set
again at 0x400552 and the original opcode was restored at 0x400554, then
execution was continued.

I am using the function "int target_read_code (CORE_ADDR memaddr,
gdb_byte *myaddr, ssize_t len)" to pass program memory content to the
decoder. the collected etm traces are therefore correct but, as the
memory was altered in the meantime, the decoder is misled.
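
a minimal sketch of what "program memory including patched addresses"
could look like from the decoder's side. it is self-contained C, not gdb
code: struct code_patch, read_code_with_patches() and the example bytes
are hypothetical, and it assumes the debugger keeps a list of the
breakpoint bytes it has written.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical record of one software breakpoint patch (not a gdb type). */
struct code_patch {
    uint32_t addr;      /* address of the first patched byte */
    size_t   len;       /* number of patched bytes (2 or 4 for T32) */
    uint8_t  bytes[4];  /* bytes the debugger wrote at that address */
};

/* Copy LEN bytes starting at ADDR from the unpatched IMAGE into OUT, then
   overlay every patch that intersects [ADDR, ADDR + LEN).
   Returns 0 on success, -1 if the request falls outside the image.  */
static int
read_code_with_patches (const uint8_t *image, uint32_t image_base,
                        size_t image_len,
                        const struct code_patch *patches, size_t n_patches,
                        uint32_t addr, uint8_t *out, size_t len)
{
    if (addr < image_base || len > image_len
        || addr - image_base > image_len - len)
        return -1;

    memcpy (out, image + (addr - image_base), len);

    for (size_t i = 0; i < n_patches; i++) {
        const struct code_patch *p = &patches[i];

        assert (p->len <= sizeof p->bytes);
        for (size_t k = 0; k < p->len; k++) {
            uint32_t a = p->addr + (uint32_t) k;

            /* Only bytes inside the requested window are overlaid.  */
            if (a >= addr && a - addr < len)
                out[a - addr] = p->bytes[k];
        }
    }
    return 0;
}
```

with the bytes of main at 0x400550 as the image and one 2-byte patch at
0x400552, a 4-byte read at 0x400552 returns the patched halfword followed
by the first half of the BL, i.e. what was in memory while the breakpoint
was inserted. reading with an empty patch list returns the original code.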

I need to identify the re-execution of code due to breakpoint handling, 
and roll back its impact on etm decoding.

is there a means of getting the actual content of program memory,
including patched addresses?

is there a means of getting the history of patched addresses during the
debugging of a program?

what are the type and subtype of a BKPT instruction in decoded trace
elements?

do you have any other idea for handling this situation?


I am attaching the source code of the program as well as the
disassembled binary. the code was compiled as an application running on
linux on an ARMv7-A processor (STM32MP157 SoC). the breakpoint was set
at line 43 of the source code (line 238 of the disassembled code).


Kind Regards

Zied Guermazi


[-- Attachment #2: function_call_history.s --]
[-- Type: text/plain, Size: 8811 bytes --]


function_call_history:     file format elf32-littlearm


Disassembly of section .init:

00000380 <_init>:
 380:	e92d4008 	push	{r3, lr}
 384:	eb000023 	bl	418 <call_weak_fn>
 388:	e8bd8008 	pop	{r3, pc}

Disassembly of section .plt:

0000038c <.plt>:
 38c:	e52de004 	push	{lr}		; (str lr, [sp, #-4]!)
 390:	e59fe004 	ldr	lr, [pc, #4]	; 39c <.plt+0x10>
 394:	e08fe00e 	add	lr, pc, lr
 398:	e5bef008 	ldr	pc, [lr, #8]!
 39c:	00010c2c 	.word	0x00010c2c

000003a0 <__cxa_finalize@plt>:
 3a0:	e28fc600 	add	ip, pc, #0, 12
 3a4:	e28cca10 	add	ip, ip, #16, 20	; 0x10000
 3a8:	e5bcfc2c 	ldr	pc, [ip, #3116]!	; 0xc2c

000003ac <__libc_start_main@plt>:
 3ac:	e28fc600 	add	ip, pc, #0, 12
 3b0:	e28cca10 	add	ip, ip, #16, 20	; 0x10000
 3b4:	e5bcfc24 	ldr	pc, [ip, #3108]!	; 0xc24

000003b8 <__gmon_start__@plt>:
 3b8:	e28fc600 	add	ip, pc, #0, 12
 3bc:	e28cca10 	add	ip, ip, #16, 20	; 0x10000
 3c0:	e5bcfc1c 	ldr	pc, [ip, #3100]!	; 0xc1c

000003c4 <abort@plt>:
 3c4:	e28fc600 	add	ip, pc, #0, 12
 3c8:	e28cca10 	add	ip, ip, #16, 20	; 0x10000
 3cc:	e5bcfc14 	ldr	pc, [ip, #3092]!	; 0xc14

Disassembly of section .text:

000003d0 <_start>:
 3d0:	f04f 0b00 	mov.w	fp, #0
 3d4:	f04f 0e00 	mov.w	lr, #0
 3d8:	bc02      	pop	{r1}
 3da:	466a      	mov	r2, sp
 3dc:	b404      	push	{r2}
 3de:	b401      	push	{r0}
 3e0:	f8df a024 	ldr.w	sl, [pc, #36]	; 408 <_start+0x38>
 3e4:	a308      	add	r3, pc, #32	; (adr r3, 408 <_start+0x38>)
 3e6:	449a      	add	sl, r3
 3e8:	f8df c020 	ldr.w	ip, [pc, #32]	; 40c <_start+0x3c>
 3ec:	f85a c00c 	ldr.w	ip, [sl, ip]
 3f0:	f84d cd04 	str.w	ip, [sp, #-4]!
 3f4:	4b06      	ldr	r3, [pc, #24]	; (410 <_start+0x40>)
 3f6:	f85a 3003 	ldr.w	r3, [sl, r3]
 3fa:	4806      	ldr	r0, [pc, #24]	; (414 <_start+0x44>)
 3fc:	f85a 0000 	ldr.w	r0, [sl, r0]
 400:	f7ff efd4 	blx	3ac <__libc_start_main@plt>
 404:	f7ff efde 	blx	3c4 <abort@plt>
 408:	00010bc0 	.word	0x00010bc0
 40c:	0000001c 	.word	0x0000001c
 410:	0000002c 	.word	0x0000002c
 414:	00000030 	.word	0x00000030

00000418 <call_weak_fn>:
 418:	e59f3014 	ldr	r3, [pc, #20]	; 434 <call_weak_fn+0x1c>
 41c:	e59f2014 	ldr	r2, [pc, #20]	; 438 <call_weak_fn+0x20>
 420:	e08f3003 	add	r3, pc, r3
 424:	e7932002 	ldr	r2, [r3, r2]
 428:	e3520000 	cmp	r2, #0
 42c:	012fff1e 	bxeq	lr
 430:	eaffffe0 	b	3b8 <__gmon_start__@plt>
 434:	00010ba0 	.word	0x00010ba0
 438:	00000028 	.word	0x00000028

0000043c <deregister_tm_clones>:
 43c:	4806      	ldr	r0, [pc, #24]	; (458 <deregister_tm_clones+0x1c>)
 43e:	4b07      	ldr	r3, [pc, #28]	; (45c <deregister_tm_clones+0x20>)
 440:	4478      	add	r0, pc
 442:	4a07      	ldr	r2, [pc, #28]	; (460 <deregister_tm_clones+0x24>)
 444:	447b      	add	r3, pc
 446:	4283      	cmp	r3, r0
 448:	447a      	add	r2, pc
 44a:	d003      	beq.n	454 <deregister_tm_clones+0x18>
 44c:	4b05      	ldr	r3, [pc, #20]	; (464 <deregister_tm_clones+0x28>)
 44e:	58d3      	ldr	r3, [r2, r3]
 450:	b103      	cbz	r3, 454 <deregister_tm_clones+0x18>
 452:	4718      	bx	r3
 454:	4770      	bx	lr
 456:	bf00      	nop
 458:	00010bc4 	.word	0x00010bc4
 45c:	00010bc0 	.word	0x00010bc0
 460:	00010b7c 	.word	0x00010b7c
 464:	00000024 	.word	0x00000024

00000468 <register_tm_clones>:
 468:	4808      	ldr	r0, [pc, #32]	; (48c <register_tm_clones+0x24>)
 46a:	4b09      	ldr	r3, [pc, #36]	; (490 <register_tm_clones+0x28>)
 46c:	4478      	add	r0, pc
 46e:	4a09      	ldr	r2, [pc, #36]	; (494 <register_tm_clones+0x2c>)
 470:	447b      	add	r3, pc
 472:	1a19      	subs	r1, r3, r0
 474:	447a      	add	r2, pc
 476:	1089      	asrs	r1, r1, #2
 478:	eb01 71d1 	add.w	r1, r1, r1, lsr #31
 47c:	1049      	asrs	r1, r1, #1
 47e:	d003      	beq.n	488 <register_tm_clones+0x20>
 480:	4b05      	ldr	r3, [pc, #20]	; (498 <register_tm_clones+0x30>)
 482:	58d3      	ldr	r3, [r2, r3]
 484:	b103      	cbz	r3, 488 <register_tm_clones+0x20>
 486:	4718      	bx	r3
 488:	4770      	bx	lr
 48a:	bf00      	nop
 48c:	00010b98 	.word	0x00010b98
 490:	00010b94 	.word	0x00010b94
 494:	00010b50 	.word	0x00010b50
 498:	00000034 	.word	0x00000034

0000049c <__do_global_dtors_aux>:
 49c:	b508      	push	{r3, lr}
 49e:	4b0a      	ldr	r3, [pc, #40]	; (4c8 <__do_global_dtors_aux+0x2c>)
 4a0:	4a0a      	ldr	r2, [pc, #40]	; (4cc <__do_global_dtors_aux+0x30>)
 4a2:	447b      	add	r3, pc
 4a4:	447a      	add	r2, pc
 4a6:	781b      	ldrb	r3, [r3, #0]
 4a8:	b96b      	cbnz	r3, 4c6 <__do_global_dtors_aux+0x2a>
 4aa:	4b09      	ldr	r3, [pc, #36]	; (4d0 <__do_global_dtors_aux+0x34>)
 4ac:	58d3      	ldr	r3, [r2, r3]
 4ae:	b123      	cbz	r3, 4ba <__do_global_dtors_aux+0x1e>
 4b0:	4b08      	ldr	r3, [pc, #32]	; (4d4 <__do_global_dtors_aux+0x38>)
 4b2:	447b      	add	r3, pc
 4b4:	6818      	ldr	r0, [r3, #0]
 4b6:	f7ff ef74 	blx	3a0 <__cxa_finalize@plt>
 4ba:	f7ff ffbf 	bl	43c <deregister_tm_clones>
 4be:	4b06      	ldr	r3, [pc, #24]	; (4d8 <__do_global_dtors_aux+0x3c>)
 4c0:	2201      	movs	r2, #1
 4c2:	447b      	add	r3, pc
 4c4:	701a      	strb	r2, [r3, #0]
 4c6:	bd08      	pop	{r3, pc}
 4c8:	00010b62 	.word	0x00010b62
 4cc:	00010b20 	.word	0x00010b20
 4d0:	00000020 	.word	0x00000020
 4d4:	00010b4e 	.word	0x00010b4e
 4d8:	00010b42 	.word	0x00010b42

000004dc <frame_dummy>:
 4dc:	e7c4      	b.n	468 <register_tm_clones>
 4de:	bf00      	nop

000004e0 <inc>:
   You should have received a copy of the GNU General Public License
   along with this program.  If not, see <http://www.gnu.org/licenses/>.  */

int
inc (int i)
{
 4e0:	b480      	push	{r7}
 4e2:	b083      	sub	sp, #12
 4e4:	af00      	add	r7, sp, #0
 4e6:	6078      	str	r0, [r7, #4]
  return i+1;
 4e8:	687b      	ldr	r3, [r7, #4]
 4ea:	3301      	adds	r3, #1
}
 4ec:	4618      	mov	r0, r3
 4ee:	370c      	adds	r7, #12
 4f0:	46bd      	mov	sp, r7
 4f2:	f85d 7b04 	ldr.w	r7, [sp], #4
 4f6:	4770      	bx	lr

000004f8 <fib>:

int
fib (int n)
{
 4f8:	b590      	push	{r4, r7, lr}
 4fa:	b083      	sub	sp, #12
 4fc:	af00      	add	r7, sp, #0
 4fe:	6078      	str	r0, [r7, #4]
  if (n <= 1)
 500:	687b      	ldr	r3, [r7, #4]
 502:	2b01      	cmp	r3, #1
 504:	dc01      	bgt.n	50a <fib+0x12>
    return n;
 506:	687b      	ldr	r3, [r7, #4]
 508:	e00c      	b.n	524 <fib+0x2c>

  return fib(n-2) + fib(n-1);
 50a:	687b      	ldr	r3, [r7, #4]
 50c:	3b02      	subs	r3, #2
 50e:	4618      	mov	r0, r3
 510:	f7ff fff2 	bl	4f8 <fib>
 514:	4604      	mov	r4, r0
 516:	687b      	ldr	r3, [r7, #4]
 518:	3b01      	subs	r3, #1
 51a:	4618      	mov	r0, r3
 51c:	f7ff ffec 	bl	4f8 <fib>
 520:	4603      	mov	r3, r0
 522:	4423      	add	r3, r4
}
 524:	4618      	mov	r0, r3
 526:	370c      	adds	r7, #12
 528:	46bd      	mov	sp, r7
 52a:	bd90      	pop	{r4, r7, pc}

0000052c <main>:

int
main (void)
{
 52c:	b580      	push	{r7, lr}
 52e:	b082      	sub	sp, #8
 530:	af00      	add	r7, sp, #0
  int i, j;

  for (i = 0; i < 10; i++)
 532:	2300      	movs	r3, #0
 534:	603b      	str	r3, [r7, #0]
 536:	e009      	b.n	54c <main+0x20>
    j += inc(i);
 538:	6838      	ldr	r0, [r7, #0]
 53a:	f7ff ffd1 	bl	4e0 <inc>
 53e:	4602      	mov	r2, r0
 540:	687b      	ldr	r3, [r7, #4]
 542:	4413      	add	r3, r2
 544:	607b      	str	r3, [r7, #4]
  for (i = 0; i < 10; i++)
 546:	683b      	ldr	r3, [r7, #0]
 548:	3301      	adds	r3, #1
 54a:	603b      	str	r3, [r7, #0]
 54c:	683b      	ldr	r3, [r7, #0]
 54e:	2b09      	cmp	r3, #9
 550:	ddf2      	ble.n	538 <main+0xc>

  j += fib(3); /* bp.1 */
 552:	2003      	movs	r0, #3
 554:	f7ff ffd0 	bl	4f8 <fib>
 558:	4602      	mov	r2, r0
 55a:	687b      	ldr	r3, [r7, #4]
 55c:	4413      	add	r3, r2
 55e:	607b      	str	r3, [r7, #4]
  return j; /* bp.2 */
 560:	687b      	ldr	r3, [r7, #4]
}
 562:	4618      	mov	r0, r3
 564:	3708      	adds	r7, #8
 566:	46bd      	mov	sp, r7
 568:	bd80      	pop	{r7, pc}
	...

0000056c <__libc_csu_init>:
 56c:	e92d 43f8 	stmdb	sp!, {r3, r4, r5, r6, r7, r8, r9, lr}
 570:	4607      	mov	r7, r0
 572:	4e0c      	ldr	r6, [pc, #48]	; (5a4 <__libc_csu_init+0x38>)
 574:	4688      	mov	r8, r1
 576:	4d0c      	ldr	r5, [pc, #48]	; (5a8 <__libc_csu_init+0x3c>)
 578:	4691      	mov	r9, r2
 57a:	447e      	add	r6, pc
 57c:	f7ff ef00 	blx	380 <_init>
 580:	447d      	add	r5, pc
 582:	1b76      	subs	r6, r6, r5
 584:	10b6      	asrs	r6, r6, #2
 586:	d00a      	beq.n	59e <__libc_csu_init+0x32>
 588:	3d04      	subs	r5, #4
 58a:	2400      	movs	r4, #0
 58c:	3401      	adds	r4, #1
 58e:	f855 3f04 	ldr.w	r3, [r5, #4]!
 592:	464a      	mov	r2, r9
 594:	4641      	mov	r1, r8
 596:	4638      	mov	r0, r7
 598:	4798      	blx	r3
 59a:	42a6      	cmp	r6, r4
 59c:	d1f6      	bne.n	58c <__libc_csu_init+0x20>
 59e:	e8bd 83f8 	ldmia.w	sp!, {r3, r4, r5, r6, r7, r8, r9, pc}
 5a2:	bf00      	nop
 5a4:	0001094e 	.word	0x0001094e
 5a8:	00010944 	.word	0x00010944

000005ac <__libc_csu_fini>:
 5ac:	4770      	bx	lr
 5ae:	bf00      	nop

Disassembly of section .fini:

000005b0 <_fini>:
 5b0:	e92d4008 	push	{r3, lr}
 5b4:	e8bd8008 	pop	{r3, pc}

[-- Attachment #3: function_call_history.c --]
[-- Type: text/x-csrc, Size: 1056 bytes --]

/* This testcase is part of GDB, the GNU debugger.

   Copyright 2013-2019 Free Software Foundation, Inc.

   Contributed by Intel Corp. <christian.himpel@intel.com>

   This program is free software; you can redistribute it and/or modify
   it under the terms of the GNU General Public License as published by
   the Free Software Foundation; either version 3 of the License, or
   (at your option) any later version.

   This program is distributed in the hope that it will be useful,
   but WITHOUT ANY WARRANTY; without even the implied warranty of
   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
   GNU General Public License for more details.

   You should have received a copy of the GNU General Public License
   along with this program.  If not, see <http://www.gnu.org/licenses/>.  */

int
inc (int i)
{
  return i+1;
}

int
fib (int n)
{
  if (n <= 1)
    return n;

  return fib(n-2) + fib(n-1);
}

int
main (void)
{
  int i, j;

  for (i = 0; i < 10; i++)
    j += inc(i);

  j += fib(3); /* bp.1 */
  return j; /* bp.2 */
}


* Re: GDB: etm traces decoding and breakpoints for arm targets
  2020-10-31 23:10 GDB: etm traces decoding and breakpoints for arm targets Zied Guermazi
@ 2020-11-02 11:59 ` Mike Leach
  2020-11-02 15:52   ` Zied Guermazi
  2020-11-04 16:04 ` Luis Machado
  1 sibling, 1 reply; 5+ messages in thread
From: Mike Leach @ 2020-11-02 11:59 UTC
  To: Zied Guermazi; +Cc: Coresight ML, toolchain, gdb

Hi Zied,

On Sat, 31 Oct 2020 at 23:11, Zied Guermazi <zied.guermazi@trande.de> wrote:
>
> hi,
>
> while testing the implementation in gdb of branch tracing on arm
> processors using etm, I faced a situation where a breakpoint was set,
> was hit, and then the execution of the program was continued. While
> decoding the generated traces, I saw the breakpoint address (0x400552)
> executed twice, and then the following address (0x400554) also executed
> twice. the instruction at 0x400554 is a BL (a function call), and its
> second execution corrupts the function history.
>
> here is a dump of generated trace elements
>
>
> ---------------------------------
> trace_chan_id: 18
> isa: CS_ETM_ISA_T32
> start addr = 0x400552
> end addr   = 0x400554
> instructions count = 1
> last_i_type: OCSD_INSTR_OTHER
> last_i_subtype: OCSD_S_INSTR_NONE
> last instruction was executed
> last instruction size: 2
> ---------------------------------
> trace_chan_id: 18
> isa: CS_ETM_ISA_T32
> start addr = 0x400552
> end addr   = 0x400554
> instructions count = 1
> last_i_type: OCSD_INSTR_OTHER
> last_i_subtype: OCSD_S_INSTR_NONE
> last instruction was executed
> last instruction size: 2
> ---------------------------------
> trace_chan_id: 18
> isa: CS_ETM_ISA_T32
> start addr = 0x400554
> end addr   = 0x400558
> instructions count = 1
> last_i_type: OCSD_INSTR_BR
> last_i_subtype: OCSD_S_INSTR_BR_LINK
> last instruction was executed
> last instruction size: 4
> ---------------------------------
> trace_chan_id: 18
> isa: CS_ETM_ISA_T32
> start addr = 0x400554
> end addr   = 0x400558
> instructions count = 1
> last_i_type: OCSD_INSTR_BR
> last_i_subtype: OCSD_S_INSTR_BR_LINK
> last instruction was executed
> last instruction size: 4
>
> the explanation I have for this behavior is:
>
> - when the software breakpoint was set, the instruction at 0x400552 was
> replaced in memory with a BKPT instruction,
>
> - when the breakpoint was hit, the original opcode was restored at
> 0x400552 and a BKPT was set at the next instruction address (0x400554),
> then execution was continued,
>
> - when the second breakpoint (0x400554) was hit, a BKPT opcode was set
> again at 0x400552 and the original opcode was restored at 0x400554, then
> execution was continued.
>
> I am using the function "int target_read_code (CORE_ADDR memaddr,
> gdb_byte *myaddr, ssize_t len)" to pass program memory content to the
> decoder. the collected etm traces are therefore correct but, as the
> memory was altered in the meantime, the decoder is misled.
>
> I need to identify the re-execution of code due to breakpoint handling,
> and roll back its impact on etm decoding.
>
> is there a means of getting the actual content of program memory,
> including patched addresses?
>
> is there a means of getting the history of patched addresses during the
> debugging of a program?
>
> what are the type and subtype of a BKPT instruction in decoded trace
> elements?
>
I can only really comment on this question. The type / subtype
information in the output from the decoder is generated from the
decoder walking the memory image of the executed trace - not from the
trace packets themselves.
The decoder classifies instructions according to how they will affect
trace flow with the "other" category being set for the majority of
instructions.  The categories are: other, branch, indirect branch, ISB
/ DSB / DMB / WFI / WFE.
These are important in program flow trace (PTM 1.x, ETM 4.x) as these
determine which instruction we attach the E/N atoms to. BKPT will be
classified as "other", if it is seen, as it has no effect on normal
program flow. It will cause an exception which has a specific trace
packet format.
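
To illustrate how an atom ends up attached to an instruction, here is a
minimal sketch in C. It is not OpenCSD code: the enum, the structs and
next_range() are hypothetical names, the four categories are a
simplification of the list above, and the walk is purely linear (a real
decoder would continue at the branch target after an E atom on a branch).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Simplified instruction categories (illustrative, not OpenCSD's enum). */
enum insn_kind {
    INSN_OTHER,             /* no effect on trace flow */
    INSN_BRANCH,            /* direct branch */
    INSN_BRANCH_INDIRECT,   /* indirect branch */
    INSN_BARRIER_OR_WAIT    /* ISB / DSB / DMB / WFI / WFE */
};

/* One instruction as seen while walking the memory image. */
struct insn {
    uint32_t       addr;
    uint8_t        size;    /* 2 or 4 bytes for T32 */
    enum insn_kind kind;
};

/* One decoded instruction range, ending at a waypoint instruction. */
struct insn_range {
    uint32_t       start;
    uint32_t       end;           /* address just past the last instruction */
    size_t         count;
    enum insn_kind last_kind;
    int            last_executed; /* 1 for an E atom, 0 for an N atom */
};

/* Walk INSNS from index *POS to the first instruction that is not
   INSN_OTHER, attach the atom to it and describe the range covered.
   On success *POS is set to the fall-through successor.
   Returns 0 on success, -1 if no waypoint instruction is left.  */
static int
next_range (const struct insn *insns, size_t n, size_t *pos,
            int atom_is_e, struct insn_range *out)
{
    assert (insns != NULL && pos != NULL && out != NULL);

    size_t i = *pos;
    while (i < n && insns[i].kind == INSN_OTHER)
        i++;
    if (i >= n)
        return -1;

    out->start = insns[*pos].addr;
    out->end = insns[i].addr + insns[i].size;
    out->count = i - *pos + 1;
    out->last_kind = insns[i].kind;
    out->last_executed = atom_is_e;
    *pos = i + 1;
    return 0;
}
```

For the loop exit in main from the attached disassembly, walking from
0x54c with an N atom gives one range ending at the ble at 0x550, and a
following E atom gives a range from 0x552 to the end of the bl.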

Regards

Mike


> do you have any other idea for handling this situation?
>
>
> I am attaching the source code of the program as well as the
> disassembled binary. the code was compiled as an application running on
> linux on an ARMv7-A processor (STM32MP157 SoC). the breakpoint was set
> at line 43 of the source code (line 238 of the disassembled code).
>
>
> Kind Regards
>
> Zied Guermazi
>
> _______________________________________________
> CoreSight mailing list
> CoreSight@lists.linaro.org
> https://lists.linaro.org/mailman/listinfo/coresight



-- 
Mike Leach
Principal Engineer, ARM Ltd.
Manchester Design Centre. UK


* Re: GDB: etm traces decoding and breakpoints for arm targets
  2020-11-02 11:59 ` Mike Leach
@ 2020-11-02 15:52   ` Zied Guermazi
  2020-11-03 11:02     ` Mike Leach
  0 siblings, 1 reply; 5+ messages in thread
From: Zied Guermazi @ 2020-11-02 15:52 UTC
  To: Mike Leach, linaro-toolchain, Coresight ML
  Cc: markus.t.metzger, luis.machado, gdb

hi

Thanks Mike for your support, it was very helpful.

to put everything together: on arm, gdb inserts a sw breakpoint by
patching the code with an undefined instruction (see the comments in
arm-tdep.c, line 7687). when the breakpoint is hit, exception number 9
("Undefined Instruction exception") is raised and a branch packet
carrying this information is generated in the etm trace. the trap is
handled by the kernel, which sends the appropriate signal to the gdb
process.

when the user continues execution, gdb restores the original code and
executes the instruction. as a result, the instruction is traced twice
with an exception in between; the same happens for the next executed
instruction.

here is the log of decoded packets

[btrace] [ftrace] update insn: fun = main, file = ./function_call_history.c, level = 0, insn = [1; 2)
cs_etm_decoder_trace_element_callback: elem->elem_type OCSD_GEN_TRC_ELEM_INSTR_RANGE <= first execution attempt that raises an undefined instruction exception
trace_chan_id: 18
isa: CS_ETM_ISA_T32
start addr = 0x400534
end addr   = 0x400536
instructions count = 1
last_i_type: OCSD_INSTR_OTHER
last_i_subtype: OCSD_S_INSTR_NONE
last instruction was executed
last instruction size: 2
[btrace] [ftrace] update insn: fun = main, file = ./function_call_history.c, level = 0, insn = [1; 3)
cs_etm_decoder_trace_element_callback: elem->elem_type OCSD_GEN_TRC_ELEM_EXCEPTION <= the exception is traced
trace_chan_id: 18
exception number: 9 <= undefined instruction exception
cs_etm_decoder_trace_element_callback: elem->elem_type OCSD_GEN_TRC_ELEM_TRACE_ON
cs_etm_decoder_trace_element_callback: elem->elem_type OCSD_GEN_TRC_ELEM_PE_CONTEXT
cs_etm_decoder_trace_element_callback: elem->elem_type OCSD_GEN_TRC_ELEM_INSTR_RANGE <= execution of the original instruction
trace_chan_id: 18
isa: CS_ETM_ISA_T32
start addr = 0x400534
end addr   = 0x400536
instructions count = 1
last_i_type: OCSD_INSTR_OTHER
last_i_subtype: OCSD_S_INSTR_NONE
last instruction was executed
last instruction size: 2

as the code was changed during execution, the code that was actually
executed cannot be reconstructed when the traces are decoded.

in addition, when tracing applications running on Linux, we are not
interested in capturing raised exceptions, so we could roll back the
last instruction in the ftrace. as this is not straightforward, we could
instead ignore the repeated instruction as a workaround.

for tracing bare-metal software, we need to keep tracing exceptions, so
we could add a flag for ignoring exceptions and enable or disable it
according to the context.

what do you think? shall I go ahead and implement it as described
above?


Kind Regards

Zied Guermazi




* Re: GDB: etm traces decoding and breakpoints for arm targets
  2020-11-02 15:52   ` Zied Guermazi
@ 2020-11-03 11:02     ` Mike Leach
  0 siblings, 0 replies; 5+ messages in thread
From: Mike Leach @ 2020-11-03 11:02 UTC
  To: Zied Guermazi
  Cc: linaro-toolchain, Coresight ML, markus.t.metzger, luis.machado, gdb

Hi Zied,

On Mon, 2 Nov 2020 at 15:52, Zied Guermazi <zied.guermazi@trande.de> wrote:
>
> hi
>
> Thanks Mike for your support, it was very helpful.
>
> to put everything together: on arm, gdb inserts a sw breakpoint by patching the code with an undefined instruction (see the comments in arm-tdep.c, line 7687). when the breakpoint is hit, exception number 9 ("Undefined Instruction exception") is raised and a branch packet carrying this information is generated in the etm trace. the trap is handled by the kernel, which sends the appropriate signal to the gdb process.
>

Looks like that code was designed for very early architectures. I
wonder if it should not use the architected BKPT instruction when
available (arch v5T onwards I think).

> when the user continues execution, gdb restores the original code and executes the instruction. as a result, the instruction is traced twice with an exception in between; the same happens for the next executed instruction.
>
> here is the log of decoded packets
>
> [btrace] [ftrace] update insn: fun = main, file = ./function_call_history.c, level = 0, insn = [1; 2)
> cs_etm_decoder_trace_element_callback: elem->elem_type OCSD_GEN_TRC_ELEM_INSTR_RANGE <= first execution attempt that raises an undefined instruction exception
> trace_chan_id: 18
> isa: CS_ETM_ISA_T32
> start addr = 0x400534
> end addr   = 0x400536
> instructions count = 1
> last_i_type: OCSD_INSTR_OTHER
> last_i_subtype: OCSD_S_INSTR_NONE
> last instruction was executed
> last instruction size: 2
> [btrace] [ftrace] update insn: fun = main, file = ./function_call_history.c, level = 0, insn = [1; 3)
> cs_etm_decoder_trace_element_callback: elem->elem_type OCSD_GEN_TRC_ELEM_EXCEPTION  <= the exception is traced
> trace_chan_id: 18
> exception number: 9 <= undefined instruction exception
> cs_etm_decoder_trace_element_callback: elem->elem_type OCSD_GEN_TRC_ELEM_TRACE_ON
> cs_etm_decoder_trace_element_callback: elem->elem_type OCSD_GEN_TRC_ELEM_PE_CONTEXT
> cs_etm_decoder_trace_element_callback: elem->elem_type OCSD_GEN_TRC_ELEM_INSTR_RANGE <= execution of the original instruction
> trace_chan_id: 18
> isa: CS_ETM_ISA_T32
> start addr = 0x400534
> end addr   = 0x400536
> instructions count = 1
> last_i_type: OCSD_INSTR_OTHER
> last_i_subtype: OCSD_S_INSTR_NONE
> last instruction was executed
> last instruction size: 2
>
> as the code was changed during execution, the code that was actually executed cannot be reconstructed when the traces are decoded.
>
> in addition, when tracing applications running on Linux, we are not interested in capturing raised exceptions, so we could roll back the last instruction in the ftrace. as this is not straightforward, we could instead ignore the repeated instruction as a workaround.
>
> for tracing bare-metal software, we need to keep tracing exceptions, so we could add a flag for ignoring exceptions and enable or disable it according to the context.
>
> what do you think? shall I go ahead and implement it as described above?
>
>

I assume that in this scenario, trace collection is ongoing over the
BKPT hit / restart sequence and is decoded at some point later.
Otherwise spotting the breakpoint would be easy.

I cannot think of many circumstances where an instruction would be
executed  - or appear to be executed in the trace twice in succession*
- other than being restarted after an exception. This debug case is
one of those occasions - I would check that there are not other
exceptions that might mimic this.

Other than that it would appear that the execute / exception / execute
again pattern can be used to spot a break and the 1st execute could be
dropped since this was the breakpoint. If it was set on a conditional
then you are interested in the actual trace result which could be
either executed or not.

Regards

Mike

* A branch to self might appear like this, as would setting the trace
address filters to include just a single instruction - but there would
be no intervening exceptions in these cases.
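
The execute / exception / execute-again check could be sketched as
below. This is a minimal C sketch under stated assumptions, not gdb or
OpenCSD code: the element types, struct trace_elem and
drop_breakpoint_replays() are hypothetical, the first range is assumed
to cover only the breakpointed instruction (as in the log above), and
the exception number to match is supplied by the caller.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Simplified element kinds (illustrative, not the OpenCSD enumerators). */
enum elem_type { ELEM_INSTR_RANGE, ELEM_EXCEPTION, ELEM_OTHER };

struct trace_elem {
    enum elem_type type;
    uint32_t start, end;  /* valid for ELEM_INSTR_RANGE */
    uint32_t excep_num;   /* valid for ELEM_EXCEPTION */
};

/* Compact ELEMS in place and return the new element count.
   A range is dropped when it is immediately followed by an exception
   numbered BKPT_EXCEP_NUM and the next range after that exception
   starts at the same address.  When IGNORE_EXCEPTIONS is non-zero,
   every exception element is dropped as well.  */
static size_t
drop_breakpoint_replays (struct trace_elem *elems, size_t n,
                         uint32_t bkpt_excep_num, int ignore_exceptions)
{
    assert (elems != NULL || n == 0);

    size_t out = 0;
    for (size_t i = 0; i < n; i++) {
        int replayed = 0;

        if (elems[i].type == ELEM_INSTR_RANGE && i + 1 < n
            && elems[i + 1].type == ELEM_EXCEPTION
            && elems[i + 1].excep_num == bkpt_excep_num) {
            /* Skip trace-on / context elements up to the next range.  */
            size_t j = i + 2;
            while (j < n && elems[j].type == ELEM_OTHER)
                j++;
            replayed = j < n && elems[j].type == ELEM_INSTR_RANGE
                       && elems[j].start == elems[i].start;
        }

        if (replayed)
            continue;   /* first attempt: this was the breakpoint */
        if (ignore_exceptions && elems[i].type == ELEM_EXCEPTION)
            continue;
        elems[out++] = elems[i];
    }
    return out;
}
```

A branch to self produces two identical ranges with no exception in
between, so it is left untouched. The ignore_exceptions argument
corresponds to the proposed flag for hiding exceptions when tracing
Linux applications.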

> Kind Regards
>
> Zied Guermazi
>
>
>
> On 02.11.20 12:59, Mike Leach wrote:
>
> Hi Zied,
>
> On Sat, 31 Oct 2020 at 23:11, Zied Guermazi <zied.guermazi@trande.de> wrote:
>
> hi,
>
> while testing the implementation in gdb of branch tracing on arm
> processors using etm, I faced a situation where a breakpoint was set,
> was hit, and then the execution of the program was continued. While
> decoding the generated traces, I saw the breakpoint address (0x400552)
> executed twice, and then the following address (0x400554) also executed
> twice. the instruction at 0x400554 is a BL (a function call), and its
> second execution corrupts the function history.
>
> here is a dump of generated trace elements
>
>
> ---------------------------------
> trace_chan_id: 18
> isa: CS_ETM_ISA_T32
> start addr = 0x400552
> end addr   = 0x400554
> instructions count = 1
> last_i_type: OCSD_INSTR_OTHER
> last_i_subtype: OCSD_S_INSTR_NONE
> last instruction was executed
> last instruction size: 2
> ---------------------------------
> trace_chan_id: 18
> isa: CS_ETM_ISA_T32
> start addr = 0x400552
> end addr   = 0x400554
> instructions count = 1
> last_i_type: OCSD_INSTR_OTHER
> last_i_subtype: OCSD_S_INSTR_NONE
> last instruction was executed
> last instruction size: 2
> ---------------------------------
> trace_chan_id: 18
> isa: CS_ETM_ISA_T32
> start addr = 0x400554
> end addr   = 0x400558
> instructions count = 1
> last_i_type: OCSD_INSTR_BR
> last_i_subtype: OCSD_S_INSTR_BR_LINK
> last instruction was executed
> last instruction size: 4
> ---------------------------------
> trace_chan_id: 18
> isa: CS_ETM_ISA_T32
> start addr = 0x400554
> end addr   = 0x400558
> instructions count = 1
> last_i_type: OCSD_INSTR_BR
> last_i_subtype: OCSD_S_INSTR_BR_LINK
> last instruction was executed
> last instruction size: 4
>
> the explanation I have for this behavior is that :
>
> -when setting the software breakpoint, the memory content of the
> instruction (at 0x400552) was altered to the instruction BKPT,
>
> -when the breakpoint was hit, the original opcode was set at (0x400552)
> and a BKPT was set to the next instruction address (0x400554), then the
> execution was continued
>
> -when the second breakpoint (0x400554) was hit, a BKPT opcode was
> set at (0x400552) and the original opcode was set at (0x400554) then the
> execution was continued
>
> I am using the function "int target_read_code (CORE_ADDR memaddr,
> gdb_byte *myaddr, ssize_t len)" to give program memory content to the
> decoder. so the collected etm traces are correct, but, as memory was
> altered in between, the decoder is "cheated".
>
> I need to identify the re-execution of code due to breakpoint handling,
> and roll back its impact on etm decoding.
>
> is there a means to get the actual content of program memory including
> patched addresses?
>
> is there a means of getting the history of patched addresses during the
> debugging of a program?
>
> what is the type and subtype of a BKPT instruction in decoded trace
> elements?
>
> I can only really comment on this question. The type / subtype
> information in the output from the decoder is generated from the
> decoder walking the memory image of the executed trace - not from the
> trace packets themselves.
> The decoder classifies instructions according to how they will affect
> trace flow with the "other" category being set for the majority of
> instructions.  The categories are: other, branch, indirect branch, ISB
> / DSB / DMB / WFI / WFE.
> These are important in program flow trace (PTM 1.x, ETM 4.x) as these
> determine which instruction we attach the E/N atoms to. BKPT will be
> classified as "other", if it is seen, as it has no effect on normal
> program flow. It will cause an exception which has a specific trace
> packet format.
>
> Regards
>
> Mike
>
>
> do you have any other idea for handling this situation?
>
>
> I am attaching the source code of the program as well as the
> disassembled binary. the code was compiled as an application running on
> linux on an ARMv7 A (STM32MP157 SoC). the breakpoint was set at line 43
> in the source code (line 238 in the disassembled code)
>
>
> Kind Regards
>
> Zied Guermazi
>
> _______________________________________________
> CoreSight mailing list
> CoreSight@lists.linaro.org
> https://lists.linaro.org/mailman/listinfo/coresight
>
>


-- 
Mike Leach
Principal Engineer, ARM Ltd.
Manchester Design Centre. UK


* Re: GDB: etm traces decoding and breakpoints for arm targets
  2020-10-31 23:10 GDB: etm traces decoding and breakpoints for arm targets Zied Guermazi
  2020-11-02 11:59 ` Mike Leach
@ 2020-11-04 16:04 ` Luis Machado
  1 sibling, 0 replies; 5+ messages in thread
From: Luis Machado @ 2020-11-04 16:04 UTC (permalink / raw)
  To: Zied Guermazi, coresight, toolchain, gdb

Hi,

On 10/31/20 8:10 PM, Zied Guermazi wrote:
> hi,
> 
> while testing the implementation in gdb of branch tracing on arm 
> processors using etm, I faced the situation where a breakpoint was 
> set, was hit and then the execution of the program was continued.  While 
> decoding generated traces,  I got the address of the breakpoint 
> (0x400552) executed twice, and then the following address (0x400554) 
> also executed twice. the instruction at (0x400554) is a BL ( a function 
> call) and the second execution corrupts the function history.
> 
> here is a dump of generated trace elements
> 
> 
> ---------------------------------
> trace_chan_id: 18
> isa: CS_ETM_ISA_T32
> start addr = 0x400552
> end addr   = 0x400554
> instructions count = 1
> last_i_type: OCSD_INSTR_OTHER
> last_i_subtype: OCSD_S_INSTR_NONE
> last instruction was executed
> last instruction size: 2
> ---------------------------------
> trace_chan_id: 18
> isa: CS_ETM_ISA_T32
> start addr = 0x400552
> end addr   = 0x400554
> instructions count = 1
> last_i_type: OCSD_INSTR_OTHER
> last_i_subtype: OCSD_S_INSTR_NONE
> last instruction was executed
> last instruction size: 2
> ---------------------------------
> trace_chan_id: 18
> isa: CS_ETM_ISA_T32
> start addr = 0x400554
> end addr   = 0x400558
> instructions count = 1
> last_i_type: OCSD_INSTR_BR
> last_i_subtype: OCSD_S_INSTR_BR_LINK
> last instruction was executed
> last instruction size: 4
> ---------------------------------
> trace_chan_id: 18
> isa: CS_ETM_ISA_T32
> start addr = 0x400554
> end addr   = 0x400558
> instructions count = 1
> last_i_type: OCSD_INSTR_BR
> last_i_subtype: OCSD_S_INSTR_BR_LINK
> last instruction was executed
> last instruction size: 4
> 
> the explanation I have for this behavior is that :
> 
> -when setting the software breakpoint, the memory content of the 
> instruction (at 0x400552) was altered to the instruction BKPT,
> 
> -when the breakpoint was hit, the original opcode was set at (0x400552) 
> and a BKPT was set to the next instruction address (0x400554), then the 
> execution was continued
> 
> -when the second breakpoint (0x400554) was hit, a BKPT opcode was 
> set at (0x400552) and the original opcode was set at (0x400554) then the 
> execution was continued
> 
> I am using the function "int target_read_code (CORE_ADDR memaddr, 
> gdb_byte *myaddr, ssize_t len)" to give program memory content to the 
> decoder. so the collected etm traces are correct, but, as memory was 
> altered in between, the decoder is "cheated".
> 
> I need to identify the re-execution of code due to breakpoint handling, 
> and roll back its impact on etm decoding.
> 
> is there a means to get the actual content of program memory including 
> patched addresses?

In case this has not been answered yet, there are *_raw_memory functions 
that read the actual contents of memory, without hiding breakpoint 
instructions.

Maybe those will be useful to you?

gdb/target.c:target_read_raw_memory (...)
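
To illustrate the difference between the two views of memory, here is a
minimal toy model. It is not GDB code: ToyTarget, read_raw and
read_shadowed are hypothetical names made up for this sketch, the BKPT #0
encoding (0xBE00) stands in for whatever breakpoint opcode the target
really uses, and the "original" bytes at 0x400552 are placeholders.

```cpp
#include <cstddef>
#include <cstdint>
#include <map>
#include <vector>

// Toy model only, NOT GDB internals: target memory patched with a software
// breakpoint, plus a debugger-side "shadow" copy of the original bytes.
struct ToyTarget {
    std::map<uint32_t, uint8_t> memory;  // bytes really present in target memory
    std::map<uint32_t, uint8_t> shadow;  // original bytes saved at insertion time

    // Overwrite code with a breakpoint opcode, remembering the original bytes.
    void insert_breakpoint(uint32_t addr, const std::vector<uint8_t>& bkpt) {
        for (size_t i = 0; i < bkpt.size(); ++i) {
            shadow[addr + i] = memory[addr + i];
            memory[addr + i] = bkpt[i];
        }
    }

    // Raw view: the bytes the CPU would actually fetch (unmapped bytes read as 0).
    std::vector<uint8_t> read_raw(uint32_t addr, size_t len) const {
        std::vector<uint8_t> out;
        for (size_t i = 0; i < len; ++i) {
            auto it = memory.find(addr + i);
            out.push_back(it != memory.end() ? it->second : 0);
        }
        return out;
    }

    // Shadowed view: inserted breakpoints are hidden behind the saved bytes.
    std::vector<uint8_t> read_shadowed(uint32_t addr, size_t len) const {
        std::vector<uint8_t> out = read_raw(addr, len);
        for (size_t i = 0; i < len; ++i) {
            auto it = shadow.find(addr + i);
            if (it != shadow.end()) out[i] = it->second;
        }
        return out;
    }
};

// Returns true when the two views disagree at a patched address.
bool views_differ_after_patch() {
    ToyTarget t;
    t.memory[0x400552] = 0x00;  // placeholder original bytes (Thumb NOP, 0xBF00)
    t.memory[0x400553] = 0xBF;
    t.insert_breakpoint(0x400552, {0x00, 0xBE});  // BKPT #0, little-endian
    return t.read_raw(0x400552, 2) != t.read_shadowed(0x400552, 2);
}
```

A decoder fed from the shadowed view walks the original instruction, while
one fed from the raw view sees whatever opcode is in memory at the time of
the read, which may not be what was there when the trace was captured.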

> 
> is there a means of getting the history of patched addresses during the 
> debugging of a program?

I'm afraid not.

> do you have any other idea for handling this situation?

Breakpoint instructions shouldn't appear in the execution trace history 
I suppose. So maybe just filter out the breakpoint instructions in some way?

You did mention there is some corruption though, so I don't know if 
filtering/adjusting the history will fix the corruption.
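
One possible shape for such a filter, as a rough sketch only. It assumes
the decoder also reports an exception element between the two copies of a
re-executed range, which the dump above does not show, so that part is an
assumption. TraceElem, ElemKind and drop_reexecuted_ranges are hypothetical
names for this example, not the decoder's or GDB's types.

```cpp
#include <cstdint>
#include <vector>

// Hypothetical, simplified decoded trace element; real decoder output
// carries many more fields.
enum class ElemKind { Range, Exception };

struct TraceElem {
    ElemKind kind;
    uint32_t start;  // meaningful for Range only
    uint32_t end;    // meaningful for Range only
};

// Build an instruction history made of ranges only. A range identical to
// the previously kept one is dropped when at least one exception was seen
// in between (assumed to be breakpoint handling). Identical back-to-back
// ranges with no exception in between (e.g. a branch to self) are kept.
std::vector<TraceElem> drop_reexecuted_ranges(const std::vector<TraceElem>& in) {
    std::vector<TraceElem> out;
    bool exception_since_last_range = false;
    for (const TraceElem& e : in) {
        if (e.kind == ElemKind::Exception) {
            exception_since_last_range = true;
            continue;
        }
        bool repeat = !out.empty()
                      && out.back().start == e.start
                      && out.back().end == e.end;
        if (!(repeat && exception_since_last_range))
            out.push_back(e);
        exception_since_last_range = false;
    }
    return out;
}
```

With the four ranges from the dump and an exception between each pair,
this would keep one range for 0x400552 and one for 0x400554, so the BL is
counted once. It treats every exception as breakpoint handling, which is
too coarse for real use; the exception type would need checking as well.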

Thread overview: 5+ messages
2020-10-31 23:10 GDB: etm traces decoding and breakpoints for arm targets Zied Guermazi
2020-11-02 11:59 ` Mike Leach
2020-11-02 15:52   ` Zied Guermazi
2020-11-03 11:02     ` Mike Leach
2020-11-04 16:04 ` Luis Machado
