From mboxrd@z Thu Jan 1 00:00:00 1970
Received: (qmail 14701 invoked by alias); 30 Aug 2006 15:20:01 -0000
Received: (qmail 14568 invoked by uid 22791); 30 Aug 2006 15:19:58 -0000
X-Spam-Check-By: sourceware.org
Received: from mail.systech.com (HELO mail.systech.com) (207.212.80.162) by sourceware.org (qpsmtpd/0.31) with ESMTP; Wed, 30 Aug 2006 15:19:53 +0000
Received: by mail.systech.com with Internet Mail Service (5.5.2650.21) id ; Wed, 30 Aug 2006 08:15:28 -0700
Message-ID: <74C9525D67A5FF4791614FDB06593BB10286A4@mail.systech.com>
From: Jay Foster
To: "'jporthouse@toptech.com'", ecos-discuss-return-35956-jporthouse=toptech.com@ecos.sourceware.org, ecos-discuss-return-35843-jporthouse=toptech.com@ecos.sourceware.org
Cc: ecos-discuss@ecos.sourceware.org
Date: Wed, 30 Aug 2006 15:20:00 -0000
MIME-Version: 1.0
X-Mailer: Internet Mail Service (5.5.2650.21)
Content-Type: text/plain; charset="iso-8859-1"
X-IsSubscribed: yes
Mailing-List: contact ecos-discuss-help@ecos.sourceware.org; run by ezmlm
Precedence: bulk
Sender: ecos-discuss-owner@ecos.sourceware.org
Subject: RE: [ECOS] Network driver problem only with larger programs (ARM adv needed)
X-SW-Source: 2006-08/txt/msg00272.txt.bz2

I am using a JTAG debugger (JEENI) with GDB. The JEENI (and GDB) do not provide any method of accessing the coprocessor registers of the ARM to configure/control the cache. As long as the flash on the board was blank, the board would work because the ARM core would be in the reset/default state (caches disabled) until after I loaded/ran my code.

The situation I ran into was that I had already programmed the flash with RedBoot, which boots very quickly - faster than I can start GDB. Therefore, the caches on the board were usually enabled (depending on how quick on the draw I was). On my previous projects (ARM7TDMI) the cache was a unified instruction/data cache (write through) which avoids this issue.
The ARM940T, however, has separate instruction and data caches, and the data cache is write-back. This made all the difference. After loading your program to RAM, you must clean and flush the DCACHE, and flush the ICACHE, before executing the code loaded to RAM. To the ARM core, loading the code to RAM looks like data writes, and with a write-back DCACHE those writes will be cached (or at least some of them will). Cleaning and flushing the DCACHE commits them to actual RAM. This is necessary because when you execute code from this RAM, it will not already be in the ICACHE (i.e., you couldn't have been executing code from the RAM you were loading the program to), so the instructions will be fetched from RAM. If the data cache hasn't been cleaned and flushed, incorrect instructions will be fetched. After a few compile/load/run cycles, the RAM starts to contain old, stale fragments, causing strange, unpredictable results.

My half-baked work-around was to write a relocatable assembly stub that flushes the ICACHE/DCACHE, and to have my GDB init script poke this into high RAM (an area that my RedBoot marks as non-cacheable for use with DMA devices) and execute the stub to gain control of the cache. Although not perfect, it works reasonably well.

For ROM and ROMRAM startup applications, the program starts with the ARM core in the reset/default state, so the caches are always disabled on startup. With RAM startup, it is all up to the 'loader' ('loader' = GDB/JTAG, RedBoot, etc.). The loader must load the code to RAM, then use the HAL_ICACHE_SYNC() macro, then branch to the RAM code entry point. When GDB/JTAG is the 'loader', the equivalent of the HAL_ICACHE_SYNC() macro must be performed by the GDB/JTAG debugger.
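A toy model may make the failure mode above easier to see. The C sketch below is not ARM940T code and does not use the real HAL_ICACHE_SYNC() macro. Every name in it (toy_system, toy_load, toy_dcache_clean_and_flush, and so on) is hypothetical, and the cache is an invented 8-line, one-word-per-line, direct-mapped write-back cache. It only illustrates the argument: a program loaded through a write-back data cache is partly still sitting in the cache, so instruction fetches that go to RAM see stale words until the data cache is cleaned.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define RAM_WORDS   64u  /* toy RAM size in 32-bit words (assumption) */
#define CACHE_LINES 8u   /* toy direct-mapped D-cache, one word per line (assumption) */

typedef struct {
    uint32_t ram[RAM_WORDS];     /* backing store: what an I-cache miss fetches */
    uint32_t data[CACHE_LINES];  /* word held by each D-cache line */
    uint32_t tag[CACHE_LINES];   /* RAM word index held by each line */
    uint8_t  valid[CACHE_LINES];
    uint8_t  dirty[CACHE_LINES];
} toy_system;

static void toy_init(toy_system *s) { memset(s, 0, sizeof *s); }

/* Write one line back to RAM if it holds modified data. */
static void toy_writeback(toy_system *s, uint32_t line) {
    if (s->valid[line] && s->dirty[line]) {
        s->ram[s->tag[line]] = s->data[line];
        s->dirty[line] = 0;
    }
}

/* Data write through the write-back cache: the word lands in the cache only.
 * RAM is updated later, when the line is evicted or explicitly cleaned. */
static void toy_data_write(toy_system *s, uint32_t addr, uint32_t value) {
    assert(addr < RAM_WORDS);
    uint32_t line = addr % CACHE_LINES;
    if (s->valid[line] && s->tag[line] != addr)
        toy_writeback(s, line);  /* evict the previous occupant */
    s->data[line] = value;
    s->tag[line] = addr;
    s->valid[line] = 1;
    s->dirty[line] = 1;
}

/* Instruction fetch that misses the I-cache: it reads RAM directly and
 * never looks in the D-cache. */
static uint32_t toy_insn_fetch(const toy_system *s, uint32_t addr) {
    assert(addr < RAM_WORDS);
    return s->ram[addr];
}

/* "Clean and flush": write every dirty line back, then invalidate it. */
static void toy_dcache_clean_and_flush(toy_system *s) {
    for (uint32_t line = 0; line < CACHE_LINES; line++) {
        toy_writeback(s, line);
        s->valid[line] = 0;
    }
}

/* The loader's view: copying a program image is just a run of data writes. */
static void toy_load(toy_system *s, const uint32_t *image, uint32_t base, uint32_t n) {
    for (uint32_t i = 0; i < n; i++)
        toy_data_write(s, base + i, image[i]);
}

/* Count the words an instruction fetch would get wrong. */
static unsigned toy_stale_words(const toy_system *s, const uint32_t *image,
                                uint32_t base, uint32_t n) {
    unsigned stale = 0;
    for (uint32_t i = 0; i < n; i++)
        if (toy_insn_fetch(s, base + i) != image[i])
            stale++;
    return stale;
}
```

In this model, calling toy_load() with a 12-word image and then toy_stale_words() reports 8 stale words: only the 4 words that were evicted along the way reached RAM, which matches the "or at least some of them will" remark. After toy_dcache_clean_and_flush() the count drops to 0. A real loader would also have to invalidate the ICACHE, which this sketch leaves out because the model has no instruction cache.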
Jay

-----Original Message-----
From: Joe Porthouse [mailto:jporthouse@toptech.com]
Sent: Tuesday, August 29, 2006 8:29 PM
To: ecos-discuss-return-35956-jporthouse=toptech.com@ecos.sourceware.org; ecos-discuss-return-35843-jporthouse=toptech.com@ecos.sourceware.org
Cc: ecos-discuss@ecos.sourceware.org
Subject: RE: [ECOS] Network driver problem only with larger programs (ARM adv needed)

Jay,

I'm on an XScale (PXA255). Now the ICache...now that seems to make sense! This could explain why the problem seems to come and go with each build.

I also just noticed that without my JTAG connected, my build from last night seems to run, but with the JTAG connected, it hangs. I believe I have also had builds not run even with the JTAG disconnected.

Where did you put your HAL_ICACHE_SYNC() call?

I am running with a single eCos/Application image (no RedBoot). The platform-specific startup code I had did not support ROMRAM, so I had to add the necessary code to copy the image. At boot-up my platform_setup1 macro:

#1. disables the MMU and cache and sets up SDRAM
#2. copies the application from flash (at 0x00000000) into SDRAM (at 0xA0000000)
#3. starts the MMU and cache.

The MMU is set up to swap the flash and SDRAM memory locations (SDRAM at 0x00000000), and execution simply continues from the same PC location, but now it's the SDRAM copy of the application.

Oops, I just noticed that when the MMU and cache start, the ICache is first started and flushed, then the MMU starts, swapping memory locations, followed by the DCache starting and being flushed. Is this order wrong? Or is my problem something that the JTAG or the JTAG startup script is doing?

For reference I included the platform_setup1 macro, the init_mmu_cache_on macro and my JTAG startup script.

Once again, any help is greatly appreciated.

/**********************************************************************
 * initialize controller
 **********************************************************************/
#if defined(CYG_HAL_STARTUP_ROM) || defined(CYG_HAL_STARTUP_ROMRAM)
#define PLATFORM_SETUP1 _platform_setup1
#define CYGHWR_HAL_ARM_HAS_MMU
#else
#define PLATFORM_SETUP1
#endif

    .macro _platform_setup1
    // disable MMU and cache
    mov r0, #0x78
    mcr p15, 0, r0, c1, c0, 0
    // invalidate I&D cache & BTB
    mcr p15, 0, r0, c7, c7, 0 // r0 ignored
    // drain write (& fill) buffer
    mcr p15, 0, r0, c7, c10, 4 // r0 ignored
    CPWAIT r0
    // there is only one co processor on the PXA25x
    ldr r0, =0x0000001
    mcr p15, 0, r0, c3, c0, 0
    init_sdram_cnt
    // Wake up from sleep if necessary.
    wake_from_sleep
    init_gpio
    early_uart_init
    // TOPTECH
#if defined(CYG_HAL_STARTUP_ROMRAM)
    // Relocate [copy] program image from ROM to RAM
    ldr r3,=0x00000000 // flash start phy addr
    ldr r4,=0xA0000000 // ram start phy addr
    ldr r5,=__ram_data_end
    cmp r4,r5 // jump if no data to move
    beq 2f
    sub r3,r3,#4 // loop adjustments
    sub r4,r4,#4
1:
    ldr r0,[r3,#4]! // copy info
    str r0,[r4,#4]!
    cmp r3,r5
    bne 1b
2:
#endif
    LED(11)
    init_intc_cnt // Interrupt Controller
    LED(10)
    init_clks // Clocks
    LED(9)
    init_mmu_cache_on // MMU and Cache
    LED(8)
    .endm
#endif /* CYGONCE_HAL_PLATFORM_SETUP_H */

/**********************************************************************
 * MMU/Cache
 **********************************************************************/
    .macro init_mmu_cache_on
    early_uart_out r0, r2, '.'
    ldr r0, =0x2001
    mcr p15, 0, r0, c15, c1, 0
    mcr p15, 0, r0, c7, c10, 4 // drain the write & fill buffers
    CPWAIT r0
    mcr p15, 0, r0, c7, c7, 0 // flush Icache, Dcache and BTB
    CPWAIT r0
    mcr p15, 0, r0, c8, c7, 0 // flush instruction and data TLBs
    CPWAIT r0
    early_uart_out r0, r2, '.'
    // Icache on
    mrc p15, 0, r0, c1, c0, 0
    orr r0, r0, #MMU_Control_I
    orr r0, r0, #MMU_Control_BTB
    mcr p15, 0, r0, c1, c0, 0
    CPWAIT r0
    early_uart_out r0, r2, '.'
    // Set up a stack [for calling C code]
    ldr r1, =__startup_stack
    ldr r2, =PXA2X0_RAM_BANK0_BASE
    orr sp, r1, r2
    // Create MMU tables
    bl hal_mmu_init
    early_uart_out r0, r2, '.'
    // MMU on
    ldr r2,=1f
    mrc p15, 0, r0, c1, c0, 0
    orr r0, r0, #MMU_Control_M
    orr r0, r0, #MMU_Control_R
    mcr p15, 0, r0, c1, c0, 0
    mov pc,r2
    nop
    nop
    nop
1:
    early_uart_out r0, r2, '.'
    mcr p15, 0, r0, c7, c10, 4 // drain the write & fill buffers
    CPWAIT r0
    early_uart_out r0, r2, '.'
    // Dcache on
    mrc p15, 0, r0, c1, c0, 0
    orr r0, r0, #MMU_Control_C
    mcr p15, 0, r0, c1, c0, 0
    CPWAIT r0
    early_uart_out r0, r2, '.'
    // clean/drain/flush the main Dcache
    mov r1, #0xe0000000
    mov r0, #1024
2:
    mcr p15, 0, r1, c7, c2, 5
    add r1, r1, #32
    subs r0, r0, #1
    bne 2b
    early_uart_out r0, r2, '.'
    // clean/drain/flush the mini Dcache
    mov r0, #64 // number of lines in the mini Dcache
3:
    mcr p15, 0, r1, c7, c2, 5 // allocate a Dcache line
    add r1, r1, #32 // increment the address to
    subs r0, r0, #1 // decrement the loop count
    bne 3b
    early_uart_out r0, r2, '.'
    // flush Dcache
    mcr p15, 0, r0, c7, c6, 0
    CPWAIT r0
    // drain the write & fill buffers
    mcr p15, 0, r0, c7, c10, 4
    CPWAIT r0
    .endm

>// Reset target
>reset
>// Set endian to little
>control.b=0
>// Set semi host variables
>_heap_base = A0018000H
>_heap_size = 00002000H
>_stack_size = 00000400H
>_top_of_memory= A0020000H
>// Write MSC0, MSC1, MSC2
>// CS0 - Rbuff=0, RRR=010, RDN=0010, RDF=1101, 16bits=0, 000=FLASH
>// CS1 - N.C.
>// CS2 - Ethernet I/O
>// CS3 - NVSRAM
>// CS4 - Ethernet
>// CS5 - Rbuff=0, RRR=???, RDN=????, RDF=????, 16bits=1, 000=Non Burst
>//word 0x48000008 = 0x2ef15af0 // CS1 (N.C.) / CS0 (Flash)
>word 0x48000008 = 0x7ff07ff0 // CS1 (N.C.) / CS0 (Flash)
>//word 0x4800000c = 0x7ff97ff8 // CS3 (NVSRAM 16) / CS2 (Ethernet I/O 16)
>word 0x4800000c = 0x7ff97ffc // CS3 (NVSRAM 16) / CS2 (Ethernet I/O 16 VLIO)
>//word 0x48000010 = 0x7ff87ff0 // CS5 (UART 8) / CS4 (Ethernet Data 32)
>word 0x48000010 = 0x7ff87ff4 // CS5 (UART 8) / CS4 (Ethernet Data 32 VLIO)
>// GPSR2: Set CS3 (GPIO79) high
>//word 0x40e00020 = 0x00008000
>// GPDR2: Set CS3 (GPIO79) output
>//word 0x40e00014 = 0x00008000
>// GPFR2_L: Set CS3 (GPIO79) CS function
>//word 0x40e00064 = 0x80000000
>// GPSR1: Set nPWE (GPIO49) high
>word 0x40e0001c = 0x00020000
>// GPSR2: Set CS3 (GPIO78, GPIO79 & GPIO80) high
>word 0x40e00020 = 0x0001c000
>// GPDR1: Set nPWE (GPIO49) output
>word 0x40e00010 = 0x00020000
>// GPDR2: Set CS3 (GPIO78, GPIO79 & GPIO80) output
>word 0x40e00014 = 0x0001c000
>// GAFR0_U: Set RDY (GPIO18) RDY function
>word 0x40e00058 = 0x00000010
>// GAFR1_U: Set nPWE (GPIO49) nPWE function
>word 0x40e00060 = 0xa0000008
>// GAFR2_L: Set CS2, CS3 (GPIO78, GPIO79) CS function
>word 0x40e00064 = 0xa0000000
>// GAFR2_U: Set CS4 (GPIO80) CS function
>word 0x40e00068 = 0x00000002
>// Assert MDREFR:K1RUN and MDREFR:K2RUN and configure MDREFR:K1DB2 and
>// MDREFR:K2DB2 as desired.
>word 0x48000004 = 0x03ca4fff // Controller default
>word 0x48000004 = 0x03ca4018 // Refresh1 Rate = (64MS/8192 Rows)*99.5Mhz/32 = 24
>word 0x48000004 = 0x03cf6018 // Set K0RUN, K1RUN and K2RUN
>word 0x48000004 = 0x038f6018 // Clear Self Refresh
>word 0x48000004 = 0x038ff018 // Set E0PIN and E1PIN
>// Set SDRAM config register, but don't enable any banks yet
>word 0x48000000 = 0x000009c9
>// write a reg as a delay tactic to wait 200usec
>r0 = 0
>// Write the disabled bank 9 times. Each time will cause a CBR refresh
>word 0xa0000000 = 0
>word 0xa0000000 = 0
>word 0xa0000000 = 0
>word 0xa0000000 = 0
>word 0xa0000000 = 0
>word 0xa0000000 = 0
>word 0xa0000000 = 0
>word 0xa0000000 = 0
>word 0xa0000000 = 0
>// Enable SDRAM and send MRS to configure SDRAM
>word 0x48000000 = 0x000009c9
>word 0x48000040 = 0x00220022
>// GPSR1: Set FF_RXD/GPIO34 High
>// GPSR1: Set FF_RXD/GPIO39 High
>word 0x40e0001c = 0x00000084
>// GPDR1: Set FF_RXD/GPIO34 Input
>// GPDR1: Set FF_RXD/GPIO39 Output
>//word 0x40e00010 = 0x00000080
>word 0x40e00010 = 0x00020080
>// GPFR1_L: Set FF_RXD/GPIO34 FF Function 01
>// GPFR1_L: Set FF_RXD/GPIO39 FF Function 10
>word 0x40e0005c = 0x00008010
>// setup FFUART
>// disable uart and disable interrupts
>word 0x4010000c = 0x00000000 // DLAB off
>word 0x40100004 = 0x00000000
>// set baud rate divisor
>word 0x4010000c = 0x00000080 // DLAB on
>word 0x40100000 = 0x00000008 // 115200
>word 0x40100004 = 0x00000000 // IER_DLH = 0
>// set parameters to 8, n, 1
>word 0x4010000c = 0x00000000 // DLAB off
>word 0x4010000c = 0x00000003 // LCR=3 8 bit character
>// set polled mode
>word 0x40100004 = 0x00000000
>// set normal UART mode
>word 0x40100010 = 0x00000000 // MCR = 0
>// enable UART
>word 0x40100004 = 0x00000040
>// enable and clear FIFOs
>word 0x40100008 = 0x000000c1
>word 0x40100008 = 0x000000c3
>word 0x40100008 = 0x00000005
>// send ABCD
>word 0x40100000 = 0x0000000d // CR
>word 0x40100000 = 0x0000000a // LF
>word 0x40100000 = 0x00000041
>word 0x40100000 = 0x00000042
>word 0x40100000 = 0x00000043
>word 0x40100000 = 0x00000044

Thanks,

Joe Porthouse
Toptech Systems, Inc.
Longwood, FL 32750

-----Original Message-----
From: ecos-discuss-owner@ecos.sourceware.org [mailto:ecos-discuss-owner@ecos.sourceware.org] On Behalf Of Jay Foster
Sent: Tuesday, August 29, 2006 8:30 PM
To: 'jporthouse@toptech.com'; ecos-discuss-return-35843-jporthouse=toptech.com@ecos.sourceware.org
Cc: ecos-discuss@ecos.sourceware.org
Subject: RE: [ECOS] Network driver problem only with larger programs (ARM adv needed)

Which ARM core are you using? I had some similar weird problems with an ARM940T processor that turned out to be cache related. This was particularly tricky with the JTAG debugger. After loading the application code I needed to do the equivalent of the HAL_ICACHE_SYNC() macro to flush the instruction cache, flush and clean the data cache.

Then again, your problem might be something completely different.

Jay

-----Original Message-----
From: Joe Porthouse [mailto:jporthouse@toptech.com]
Sent: Tuesday, August 29, 2006 2:02 PM
To: ecos-discuss-return-35843-jporthouse=toptech.com@ecos.sourceware.org
Cc: ecos-discuss@ecos.sourceware.org
Subject: RE: [ECOS] Network driver problem only with larger programs (ARM adv needed)

I enabled asserts and stack checking and the problem stopped. I then turned off asserts and stack checking and the problem did not reoccur...until today.

Now with asserts and stack checking enabled I get no errors, but the execution still gets hung up in the cyg_do_net_init() call from the cyg_hal_invoke_constructors() routine.

Using breakpoints and the traceback feature of my JTAG I can see exactly where things go wrong, but don't know why. All constructors get called correctly until the cyg_do_net_init is called. When this occurs execution gets two instructions into the procedure and then jumps into the middle of the cyg_timeout() function where it enters an endless loop. Checking addresses and registers everything looks ok (to me). I have even tried this on three different pieces of hardware.
I am at a complete loss on why this is occurring. I can step through the same piece of code in a small program and execution occurs as expected.

Any advice would be greatly appreciated.

Trace leading up to the offending instruction looks like:

hal_misc.c Line 202 (cyg_hal_invoke_constructors)
202 (*p) ();
     000E937C e1a0e00f MOV LR,PC
TRIG 000E9380 e414f004 LDR PC,[R4],#-004 // jump from here
     001007C8 e52de004 STR LR,[SP,#-004]! // to here, ok!

Registers at this point are:

R0   00008000
R1   00000004
R2   003d940c
R3   0037d0fc
R4   0037d85c <- constructor table address, good
R5   0037d848
R6   0b0b0b0b
R7   0b0b0b0b
R8   00000000
R9   a0003000
R10  0010032c
R11  0037f00c
R12  003d940c
SP   0037eff8
LR   000e9384
PC   001007cc <- PC jumped to correct address, now at 2nd address
CPSR 200000d3
SPSR 000000d3

Execution should follow the listing as:

_GLOBAL__I.52100_cyg_do_net_init:
001007C8 e52de004 STR LR,[SP,#-004]! <- jumped here ok.
001007CC e3a01ccb MOV R1,#0000cb00 <- PC now here.
001007D0 e2811084 ADD R1,R1,#00000084
001007D4 e3a00001 MOV R0,#00000001
001007D8 e49de004 LDR LR,[SP],#004
001007DC eafffff1 B _Z41__static_initialization_and_destruction_0ii

But on the next step execution jumps into timeout() at address 00100330:

262 cyg_uint32
263 timeout(timeout_fun *fun, void *arg, cyg_int32 delta)
264 {
cyg_timeout:
00100308 e1a0c00d MOV R12,SP
0010030C e92dddf0 STMFD SP!,{R4-R8,R10-R12,LR,PC}
00100310 e24cb004 SUB R11,R12,#00000004
00100314 e1a07002 MOV R7,R2
00100318 e1a08000 MOV R8,R0
0010031C e1a0a001 MOV R10,R1
265 int i;
266 timeout_entry *e;
267 cyg_uint32 stamp;
268
269 // this needs to be atomic - recursive calls from the alarm
270 // handler thread itself are allowed:
271 int spl = cyg_splinternal();
00100320 ebfffd88 BL cyg_splinternal
274 for (e = _timeouts, i = 0; i < NTIMEOUTS; i++, e++) {
00100324 e59f4060 LDR R4,0010038c
272
273 stamp = 0; // Assume no slots available
00100328 e3a05000 MOV R5,#00000000
0010032C e1a06000 MOV R6,R0
00100330 e1a02005 MOV R2,R5 <- WHY ARE WE HERE NOW???
275 if ((e->flags & CALLOUT_PENDING) == 0) {
00100334 e5943014 LDR R3,[R4,#014]
00100338 e2822001 ADD R2,R2,#00000001
0010033C e3130004 TST R3,#00000004
00100340 0a000006 BEQ cyg_timeout+58
00100344 e3520007 CMP R2,#00000007
00100348 e2844018 ADD R4,R4,#00000018
0010034C dafffff8 BLE cyg_timeout+2c
282 }
283 }

Joe Porthouse
Toptech Systems, Inc.
Longwood, FL 32750

--
Before posting, please read the FAQ: http://ecos.sourceware.org/fom/ecos
and search the list archive: http://ecos.sourceware.org/ml/ecos-discuss