From: Michael Jones
To: christophe
Cc: Lambrecht Jürgen, ecos discuss
Date: Tue, 04 Mar 2014 15:58:00 -0000
Subject: Re: [ECOS] Scheduler startup question
Message-Id: <55AB601E-FC02-4510-B3A7-C1970FA2E187@linear.com>
In-Reply-To: <5315F31D.9060007@stmi.com>
References: <496B24D9-62B6-48F2-BD53-1F6B9ABE2083@linear.com> <5315F31D.9060007@stmi.com>

Christophe,

When I first got SMP to work I added some code in interrupt_end to take the lock, but I moved it back to Vectors.S because I was trying to reduce changes to the kernel. Functionally, the only difference is whether the lock is taken before the ISR executes or after it.

My bigger concern is how the lock is taken. When I increase the lock count, the core doing so (core 0) may not be the holder of the lock, which leads to assertions. And if it spins while taking the lock, it deadlocks. I have not traced down the deadlock, but I think the problem is in the scheduler, where some secondary CPU is waiting.

My current solution is to use a trylock in Vectors.S and live with the fact that when it fails, it will take another real-time clock interrupt to try again. So interrupt_end is not guaranteed to be called on each interrupt. This keeps things simple. All interrupts go to core 0 except inter-CPU interrupts. Some latency is added because taking the lock is not guaranteed.

Other ways to handle this are to send interrupts to all cores, use inter-core interrupts, etc., in an effort to guarantee that the lock count is incremented by the core that holds the lock.

I was not able to figure out how i386 handled this. Does anyone know how the i386 SMP incremented the lock if the core that got the interrupt did not hold the lock?

Mike

On Mar 4, 2014, at 8:37 AM, christophe wrote:

> Hi Michael,
>
> I might remember wrong but I think in case of SMP target, the lock is not taken in Vectors.S but directly after entering interrupt_end.
> Of course this is spinlock based so it might delay posting/scheduling of the DSR.
>
> Christophe
>
> On 3/2/2014 9:19 PM, Michael Jones wrote:
>> Jurgen,
>>
>> I think I fully understand how the scheduler locking works during an interrupt now. Vectors.S takes the lock, and interrupt_end clears it. However, the normal technique of incrementing the lock count does not work with SMP. The problem is that another CPU may have the lock. Incrementing anyway leads to assertions. Attempting to take the lock with the spinlock can lead to deadlocks or an unresponsive network application.
>>
>> So I changed things so that in Vectors.S, during an interrupt, an attempt at locking is made. This means trying to take a spinlock, and the attempt might fail. If the lock is taken, interrupt_end is called. If the attempt fails, interrupt_end is not called.
>>
>> This means that a DSR may not be posted on that interrupt. This can cause some latency based on the real-time clock interrupt rate, or the time until a thread switch. However, it is stable and assertion free. Also, a HAL could implement a timeout on the try spinlock, which might reduce latency.
>>
>> To support the try and to test whether the lock was taken, I had to add some functions to the kernel. The following wiki page has been updated to reflect the kernel changes.
>>
>> https://sourceforge.net/p/ecosfreescale/wiki/SMP%20Kernel/
>>
>> Anyone with SMP knowledge might want to take a look. There may be better solutions to some of these problems. But at least for now, the IMX6 SMP HAL seems stable and I can run IO intensive Lua scripts over telnet reliably, even when the client aborts.
>>
>> The client abort means telnet has to kill a thread. This was quite a challenge. Telnet is creating a separate heap for Lua so it can kill the thread and reclaim memory. The remaining problem is closing file handles.
>> I still occasionally get assertions when a handle is closed by a thread that does not own it. I don't think that can be solved without adding some new functions dedicated to cleanup of file handles by an outside thread.
>>
>> Mike
>>
>> On Feb 26, 2014, at 11:40 PM, Lambrecht Jürgen wrote:
>>
>>> As far as I know the scheduler is started after cyg_user_start(), used by your application to initialize everything. Do you use cyg_user_start?
>>>
>>> Sent from Samsung Mobile
>>>
>>> -------- Original message --------
>>> From: Michael Jones
>>> Date:
>>> To: ecos discuss
>>> Subject: [ECOS] Scheduler startup question
>>>
>>> I have a question about proper scheduler locking startup behavior.
>>>
>>> The context is I am cleaning up my iMX6 HAL and attempting to make things work without a couple of kernel hacks I added to make it work.
>>>
>>> The question has to do with sched_lock. By default this has a value of 1, so during startup the scheduler is locked.
>>>
>>> When there is an interrupt, sched_lock is incremented in Vectors.S, and decremented in interrupt_end.
>>>
>>> However, I am getting an assert in sync.h which is part of the BSD stack. The assert is because it expects the lock to be zero.
>>>
>>> The question is, during the startup process, how does the lock get set to zero after initialization? Is it supposed to stay 1 while hardware is initialized and through all the constructors, etc? Is it cleared by the scheduler somehow? Is the HAL supposed to zero it at some point during startup?
>>>
>>> My HAL is part of the ARM HAL, so if this is device specific, it is the ARM HAL I am working with.
>>>
>>> Mike

--
Before posting, please read the FAQ: http://ecos.sourceware.org/fom/ecos
and search the list archive: http://ecos.sourceware.org/ml/ecos-discuss
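
For readers following the thread, the trylock approach described in the message above (try the scheduler lock on interrupt entry, and skip interrupt_end when the attempt fails) can be sketched in portable C. This is only an illustrative model, not the eCos kernel code or the contents of Vectors.S: it uses C11 atomics in place of the HAL spinlock, and every name in it (sched_lock_t, irq_stats_t, sched_lock_init, sched_trylock, sched_unlock, on_interrupt) is hypothetical.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

#define NO_OWNER (-1)

/* Hypothetical model of an SMP scheduler lock: an owning CPU plus a
 * nesting count. Only the owning CPU ever touches the count. */
typedef struct {
    atomic_int owner;  /* CPU id holding the lock, or NO_OWNER */
    int count;         /* nesting depth while owned */
} sched_lock_t;

/* Bookkeeping for the sketch only: how often the end-of-interrupt
 * work ran and how often it was skipped. */
typedef struct {
    int ended;
    int skipped;
} irq_stats_t;

static void sched_lock_init(sched_lock_t *l)
{
    atomic_init(&l->owner, NO_OWNER);
    l->count = 0;
}

/* Try to take the lock without spinning. Returns false if another
 * CPU owns it, so the caller can never deadlock here. */
static bool sched_trylock(sched_lock_t *l, int cpu)
{
    if (atomic_load(&l->owner) == cpu) {
        l->count++;            /* already ours: just nest */
        return true;
    }
    int expected = NO_OWNER;
    if (atomic_compare_exchange_strong(&l->owner, &expected, cpu)) {
        l->count = 1;          /* lock was free and is now ours */
        return true;
    }
    return false;              /* held by another CPU */
}

/* Drop one nesting level; release ownership at depth zero. */
static void sched_unlock(sched_lock_t *l, int cpu)
{
    assert(atomic_load(&l->owner) == cpu);
    if (--l->count == 0)
        atomic_store(&l->owner, NO_OWNER);
}

/* Model of the interrupt path: the end-of-interrupt work (a stand-in
 * for interrupt_end) runs only when the trylock succeeds. */
static bool on_interrupt(sched_lock_t *l, irq_stats_t *s, int cpu)
{
    if (!sched_trylock(l, cpu)) {
        s->skipped++;          /* retried on a later interrupt */
        return false;
    }
    s->ended++;                /* stand-in for interrupt_end's work */
    sched_unlock(l, cpu);
    return true;
}
```

In this model, while CPU 1 holds the lock a call to on_interrupt(&lock, &stats, 0) returns false and only increments stats.skipped; once CPU 1 releases the lock, the next call succeeds. If CPU 0 itself already holds the lock, the attempt simply nests the count, which corresponds to the uniprocessor behavior of incrementing sched_lock.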