Date: Tue, 05 Feb 2013 16:57:00 -0000
From: Elad Yosef
To: eCos Discussion, ecos-discuss@sources.redhat.com
Subject: [ECOS] Scheduler problem?

Hi all,

I'm running a benchmark of my application, and after some time one of the threads (named MIRS) stops running.
All other threads keep running (I still get pings, and the benchmark continues on the other threads).
I have ASSERTS and TRACES enabled.
The scheduler is not configured with time-slicing.

I have implemented a CLI command that gets the thread info via "get_next" and "get_info", and also added the TRACE DUMP to it.

This is the output in normal mode:

Thread # 1(Idle Thread) state(0) stack on 0x80045418 length 1536.
Max usage - 1396(90%)
Thread # 2(COM-TX) state(1) stack on 0x80039690 length 2560.
Max usage - 1400(54%)
Thread # 3(COM-RX) state(1) stack on 0x8003a090 length 2560.
Max usage - 1616(63%)
Thread # 4(CLI ) state(0) stack on 0x8003e290 length 3584.
Max usage - 2112(58%)
Thread # 5(MODEM1) state(1) stack on 0x8003cc90 length 2816.
Max usage - 1312(46%)
Thread # 6(RF1 ) state(1) stack on 0x8003b690 length 2816.
Max usage - 1848(65%)
Thread # 7(MODEM2) state(1) stack on 0x8003d790 length 2816.
Max usage - 1312(46%)
Thread # 8(RF2 ) state(1) stack on 0x8003c190 length 2816.
Max usage - 1928(68%)
Thread # 9(MIRS ) state(1) stack on 0x8003aa90 length 3072.
Max usage - 1648(53%)
Thread #10(TCP ) state(1) stack on 0x8004b768 length 2048.
Max usage - 1600(78%)
Thread #11(ETH ) state(1) stack on 0x8004bf68 length 2048.
Max usage - 1328(64%)

Scheduler:

Lock: 0
Current Thread: CLI

Threads:

Idle Thread pri = 31 state = R id = 1
stack base = 80045418 ptr = 80045860 size = 00000600
sleep reason NONE wake reason NONE
queue = 00000000 wait info = 00000000

COM-TX pri = 7 state = S id = 2
stack base = 80039690 ptr = 80039f30 size = 00000a00
sleep reason WAIT wake reason NONE
queue = 8004e51c wait info = 80039fd0

COM-RX pri = 15 state = S id = 3
stack base = 8003a090 ptr = 8003a838 size = 00000a00
sleep reason WAIT wake reason NONE
queue = 8004b228 wait info = 00000000

CLI pri = 20 state = R id = 4
stack base = 8003e290 ptr = 8003ebd8 size = 00000e00
sleep reason NONE wake reason DONE
queue = 00000000 wait info = 00000000

MODEM1 pri = 9 state = S id = 5
stack base = 8003cc90 ptr = 8003d5a8 size = 00000b00
sleep reason WAIT wake reason NONE
queue = 8004e504 wait info = 00000000

RF1 pri = 9 state = S id = 6
stack base = 8003b690 ptr = 8003bcf0 size = 00000b00
sleep reason DELAY wake reason NONE
queue = 00000000 wait info = 00000000

MODEM2 pri = 9 state = S id = 7
stack base = 8003d790 ptr = 8003e0a8 size = 00000b00
sleep reason WAIT wake reason NONE
queue = 8004e514 wait info = 00000000

RF2 pri = 9 state = S id = 8
stack base = 8003c190 ptr = 8003c990 size = 00000b00
sleep reason WAIT wake reason NONE
queue = 8004e50c wait info = 00000000

MIRS pri = 11 state = S id = 9
stack base = 8003aa90 ptr = 8003b510 size = 00000c00
sleep reason WAIT wake reason NONE
queue = 8004e4f4 wait info = 8003b5b0

TCP pri = 7 state = S id = 10
stack base = 8004b768 ptr = 8004bdc0 size = 00000800
sleep reason TIMEOUT wake reason NONE
queue = 8004b668 wait info = 00000000

ETH pri = 6 state = S id = 11
stack base = 8004bf68 ptr = 8004c618 size = 00000800
sleep reason WAIT wake reason NONE
queue = 80033f98 wait info = 00000000

And this is the output in the faulty mode:

Thread # 1(Idle Thread) state(0) stack on 0x80045418 length 1536.
Max usage - 1396(90%)
Thread # 2(COM-TX) state(1) stack on 0x80039690 length 2560.
Max usage - 1400(54%)
Thread # 3(COM-RX) state(1) stack on 0x8003a090 length 2560.
Max usage - 1616(63%)
Thread # 4(CLI ) state(0) stack on 0x8003e290 length 3584.
Max usage - 2112(58%)
Thread # 5(MODEM1) state(1) stack on 0x8003cc90 length 2816.
Max usage - 1312(46%)
Thread # 6(RF1 ) state(1) stack on 0x8003b690 length 2816.
Max usage - 1872(66%)
Thread # 7(MODEM2) state(1) stack on 0x8003d790 length 2816.
Max usage - 1312(46%)
Thread # 8(RF2 ) state(1) stack on 0x8003c190 length 2816.
Max usage - 1928(68%)
Thread # 9(MIRS ) state(0) stack on 0x8003aa90 length 3072.
Max usage - 1648(53%)
Thread #10(TCP ) state(1) stack on 0x8004b768 length 2048.
Max usage - 1600(78%)
Thread #11(ETH ) state(1) stack on 0x8004bf68 length 2048.
Max usage - 1328(64%)

Scheduler:

Lock: 0
Current Thread: CLI

Threads:

Idle Thread pri = 31 state = R id = 1
stack base = 80045418 ptr = 80045860 size = 00000600
sleep reason NONE wake reason NONE
queue = 00000000 wait info = 00000000

COM-TX pri = 7 state = S id = 2
stack base = 80039690 ptr = 80039f30 size = 00000a00
sleep reason WAIT wake reason NONE
queue = 8004e51c wait info = 80039fd0

COM-RX pri = 15 state = S id = 3
stack base = 8003a090 ptr = 8003a838 size = 00000a00
sleep reason WAIT wake reason NONE
queue = 8004b228 wait info = 00000000

CLI pri = 20 state = R id = 4
stack base = 8003e290 ptr = 8003ebd8 size = 00000e00
sleep reason NONE wake reason DONE
queue = 00000000 wait info = 00000000

MODEM1 pri = 9 state = S id = 5
stack base = 8003cc90 ptr = 8003d5a8 size = 00000b00
sleep reason WAIT wake reason NONE
queue = 8004e504 wait info = 00000000

RF1 pri = 9 state = S id = 6
stack base = 8003b690 ptr = 8003be90 size = 00000b00
sleep reason WAIT wake reason NONE
queue = 8004e4fc wait info = 00000000

MODEM2 pri = 9 state = S id = 7
stack base = 8003d790 ptr = 8003e0a8 size = 00000b00
sleep reason WAIT wake reason NONE
queue = 8004e514 wait info = 00000000

RF2 pri = 9 state = S id = 8
stack base = 8003c190 ptr = 8003c990 size = 00000b00
sleep reason WAIT wake reason NONE
queue = 8004e50c wait info = 00000000

MIRS pri = 11 state = R id = 9
stack base = 8003aa90 ptr = 8003b510 size = 00000c00
sleep reason NONE wake reason DONE
queue = 00000000 wait info = 8003b5b0

TCP pri = 7 state = S id = 10
stack base = 8004b768 ptr = 8004bdc0 size = 00000800
sleep reason TIMEOUT wake reason NONE
queue = 8004b668 wait info = 00000000

ETH pri = 6 state = S id = 11
stack base = 8004bf68 ptr = 8004c618 size = 00000800
sleep reason WAIT wake reason NONE
queue = 80033f98 wait info = 00000000

Can anyone advise?
Should I suspect the priority inversion protocols?
Are there other tools I can use to find the root cause?
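
For context, a thread listing like the ones above could be produced with the eCos kernel calls cyg_thread_get_next() and cyg_thread_get_info(), which is presumably what "get_next" and "get_info" refer to. The following is only a minimal sketch under that assumption, not the actual CLI command. The names stack_pct() and dump_threads() and the USE_ECOS_KERNEL switch are hypothetical, and the host-side stand-ins (types and two fake threads filled with values from the faulty-mode listing) exist only so the sketch compiles without eCos. Printing set_pri next to cur_pri is an addition that is not in the original output; a difference between the two would suggest the thread is currently running at a priority changed by a priority inversion protocol.

```c
#include <assert.h>
#include <stdio.h>

#ifdef USE_ECOS_KERNEL               /* hypothetical switch, not an eCos macro */
#include <cyg/kernel/kapi.h>         /* real cyg_thread_get_next/get_info */
#else
/* Host-side stand-ins (assumptions) so the sketch builds without eCos. */
typedef unsigned long  cyg_handle_t;
typedef unsigned short cyg_uint16;
typedef unsigned int   cyg_uint32;
typedef int            cyg_priority_t;
typedef unsigned long  cyg_addrword_t;
typedef int            cyg_bool_t;
typedef struct {
    cyg_handle_t   handle;
    cyg_uint16     id;
    cyg_uint32     state;
    char          *name;
    cyg_priority_t set_pri;
    cyg_priority_t cur_pri;
    cyg_addrword_t stack_base;
    cyg_uint32     stack_size;
    cyg_uint32     stack_used;
} cyg_thread_info;

/* Two fake threads; numbers copied from the faulty-mode listing. */
static cyg_thread_info fake[] = {
    { 1, 4, 0, "CLI",  20, 20, 0x8003e290UL, 3584, 2112 },
    { 2, 9, 0, "MIRS", 11, 11, 0x8003aa90UL, 3072, 1648 },
};
#define FAKE_COUNT (sizeof fake / sizeof fake[0])

/* Fake handles are 1-based indices into fake[]; 0 starts the walk. */
static cyg_bool_t cyg_thread_get_next(cyg_handle_t *thread, cyg_uint16 *id)
{
    cyg_handle_t next = *thread + 1;
    if (next > FAKE_COUNT)
        return 0;
    *thread = next;
    *id = fake[next - 1].id;
    return 1;
}

static cyg_bool_t cyg_thread_get_info(cyg_handle_t thread, cyg_uint16 id,
                                      cyg_thread_info *info)
{
    if (thread == 0 || thread > FAKE_COUNT || fake[thread - 1].id != id)
        return 0;
    *info = fake[thread - 1];
    return 1;
}
#endif

/* Stack usage as a truncated percentage, e.g. 1396 of 1536 gives 90. */
static unsigned stack_pct(cyg_uint32 used, cyg_uint32 size)
{
    assert(size > 0);
    return (unsigned)((used * 100u) / size);
}

/* Walk all threads, print a summary for each, and return the count. */
static int dump_threads(void)
{
    cyg_handle_t thread = 0;         /* handle 0 and id 0 start the iteration */
    cyg_uint16 id = 0;
    int count = 0;

    while (cyg_thread_get_next(&thread, &id)) {
        cyg_thread_info info;
        if (!cyg_thread_get_info(thread, id, &info))
            break;                   /* lookup failed, stop the walk */
        printf("Thread #%2u(%-6s) state(%u) stack on 0x%08lx length %u.\n",
               (unsigned)info.id, info.name ? info.name : "----",
               (unsigned)info.state, (unsigned long)info.stack_base,
               (unsigned)info.stack_size);
        printf("Max usage - %u(%u%%) set_pri %d cur_pri %d\n",
               (unsigned)info.stack_used,
               stack_pct(info.stack_used, info.stack_size),
               (int)info.set_pri, (int)info.cur_pri);
        count++;
    }
    return count;
}
```

Calling dump_threads() from a CLI handler would print two lines per thread in the same layout as the listings above. With state(0) meaning runnable and state(1) meaning sleeping, the interesting case in the faulty listing is MIRS reporting state(0) while not making progress.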

Thanks,
Elad

--
Before posting, please read the FAQ: http://ecos.sourceware.org/fom/ecos
and search the list archive: http://ecos.sourceware.org/ml/ecos-discuss