From mboxrd@z Thu Jan 1 00:00:00 1970
From: Nick Garnett
To: ecos-discuss@sourceware.cygnus.com
Subject: Re: [ECOS] eCos FP support suggestions.
Date: Wed, 27 Oct 1999 08:15:00 -0000
Message-id:
References: <87puz4zfno.fsf@osv.javad.ru> <87aepqx7mx.fsf@osv.javad.ru>
X-SW-Source: 1999-10/msg00100.html

Sergei Organov writes:

Sergei,

Sorry that it's taken me a bit of time to get back to this.

> Nick,
>
> I already heard before from Bart about the idea to decide if thread
> needs FP context by handling "FP unavailable" exception and thus
> don't add additional support in the HAL interface. This approach has
> its drawbacks, and I think that it should be taken only if (even
> backward compatible) changes in HAL API are strictly prohibited.
>
> I believe that it could be implemented in the way you described.
> However, here are things that bother me in this approach:
>
> 1. What to do if architecture appears where "FP enable bit" just
> doesn't exist and thus there is no way to get exception on first FP
> instruction?

In this case we really have no alternative but to assume that all
threads are FP-using and switch FP state on each context switch.
Unfortunately we cannot really rely on the user telling us which
threads use FP and which don't, since the issue is often orthogonal to
the threadedness of the application. It is not always easy to cleanly
divide the code of the app into FP and non-FP parts and ensure that
threads stay in their own halves.

However, this is largely an academic issue: most of the architectures
we support have this facility, since their designers expect operating
systems to switch FPU state lazily using exactly the mechanism I
propose.

>
> 2. Anyway it'd be fine to have a way to define non-FP task
> explicitly (and get "FP not available" exception if FP operation is
> used). It will also allow to don't have "static" FP area at the base
> of the stack for such light-weight task.

This would be a reasonable optional enhancement to the basic mechanism.
However, see my comments about the impact on the kernel interfaces
later.

>
> 3. Porting of FP support to new targets seems to be more difficult
> with this approach, because all common logic (that is in turn more
> complex) is to be implemented in the HAL instead of kernel.
>

There is not really very much common code here. Nearly all of it has
to be implemented in assembler, both for performance and because it
works very close to the machine. Even in your design, the common code
really only consists of a few per-thread variables and a few tests,
nothing very complex. With a suitable model to code from, porting to a
new architecture is often a simple matter of translating instruction
for instruction. I have done this many times and it is very easy.
Often it is cleaner and simpler to reimplement a small piece of common
code than to provide a more complex interface simply so that it may be
shared.

Also, remember that the whole thing does not need to be implemented
from the start. It is acceptable to implement only option 1 and to add
the others as enhancements as they are required.

> 4. Potentially time-consuming operations of handling exception and
> initializing of FP context occurs at hardly predictable time moments
> (when first FP instruction is executed) instead of well defined moment
> of task creation.

FPU exception handling should not be very expensive. Beyond the code
to save and restore the FPU contexts, it should not be more than a
handful of instructions. Remember, in their "real" incarnations as
workstation CPUs, these processors are doing this kind of thing all
the time, and hardware support for these exceptions is quite slick.

I left many details out of my description of how things work for
simplicity. When I talk about initializing the FPU, this may just mean
loading an FPU context full of zeroes.
However, for many architectures it may be cheaper to do an FPU
initialization (which may just be to load zeroes into all the
registers) than to load an FPU context, so we should do this when the
option is available (such as first use of FP by a thread).

Initializing the static per-thread FP context will happen at thread
initialization. It should be set up so that it can just be loaded into
the FPU as if it were a saved context, or marked invalid if it is
cheaper to initialize the FPU directly.

I agree that having any work done in the FPU exception handler makes
the first FPU instruction after a context switch take much longer.
However, this does have a fixed maximum duration (the time to save and
load a whole FPU context plus a few instructions).

Any mechanism that either lazily switches FPU contexts, or allows
threads to be optionally FP-using or not, will introduce
non-determinism. This is true whether the switch is done in an FPU
exception handler or in the context switch code. The only way of
ensuring determinism is to switch FPU contexts on every thread switch
and reckon the extra time for this into your calculations as a
constant overhead.

>
> 5. Are there any benefits of this approach besides unchanged HAL
> interface? Programmer doesn't need to decide if particular task needs
> FP context. What else?
>

We would also have to extend or change the kernel interface, since it
would be necessary either to specify that a thread was FP-using on
creation, or to notify the kernel of that fact later. This would
require either the constructor for the Cyg_Thread class to be changed,
or some new member functions to be added. These changes would then
have to be reflected in the C API. All of this would have major
effects on existing code and documentation. In general I want to avoid
having to make such far-reaching changes if an alternative solution
exists.

-- 
Nick Garnett           mailto:nickg@cygnus.co.uk
Cygnus Solutions, UK   http://www.cygnus.co.uk
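
For illustration, the lazy switching scheme discussed above can be
sketched roughly as follows. This is a host-side simulation in C, not
eCos code: every name in it (fp_context, thread, machine,
context_switch, fp_unavailable, fp_store, fp_load) is hypothetical,
the "FP enable bit" is an ordinary flag, and a real port would do this
work in assembler inside the HAL. It assumes the FPU state is left in
place on a context switch and only saved and reloaded when a different
thread executes its first FP instruction.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define FP_REGS 32

/* Hypothetical static per-thread FP save area. */
typedef struct {
    double regs[FP_REGS];
    bool   valid;   /* false: initialize the FPU directly instead of loading */
} fp_context;

typedef struct {
    const char *name;
    fp_context  fp;
} thread;

/* Simulated machine state standing in for real hardware (assumption). */
typedef struct {
    double   regs[FP_REGS];  /* live FPU registers */
    bool     fp_enabled;     /* stand-in for the "FP enable bit" */
    thread  *fp_owner;       /* thread whose state is live in the FPU */
    thread  *current;        /* currently running thread */
    unsigned fp_switches;    /* FPU hand-overs performed, for illustration */
} machine;

/* Thread initialization: mark the save area invalid so that first use
   initializes the FPU rather than loading a saved context. */
void thread_init(thread *t, const char *name) {
    t->name = name;
    for (int i = 0; i < FP_REGS; i++) t->fp.regs[i] = 0.0;
    t->fp.valid = false;
}

/* Context switch: FP registers are not touched. The FPU is enabled only
   if the incoming thread already owns the live FPU state. */
void context_switch(machine *m, thread *next) {
    m->current = next;
    m->fp_enabled = (m->fp_owner == next);
}

/* "FP unavailable" exception handler: save the previous owner's state,
   then load or initialize the state of the current thread. */
void fp_unavailable(machine *m) {
    thread *t = m->current;
    assert(t != NULL && !m->fp_enabled);
    if (m->fp_owner != NULL) {
        memcpy(m->fp_owner->fp.regs, m->regs, sizeof m->regs);
        m->fp_owner->fp.valid = true;
    }
    if (t->fp.valid) {
        memcpy(m->regs, t->fp.regs, sizeof m->regs);
    } else {
        for (int i = 0; i < FP_REGS; i++) m->regs[i] = 0.0; /* FPU init */
    }
    m->fp_owner = t;
    m->fp_enabled = true;
    m->fp_switches++;
}

/* Stand-ins for FP instructions: they trap first if the FPU is disabled. */
void fp_store(machine *m, int reg, double v) {
    if (!m->fp_enabled) fp_unavailable(m);
    m->regs[reg] = v;
}

double fp_load(machine *m, int reg) {
    if (!m->fp_enabled) fp_unavailable(m);
    return m->regs[reg];
}
```

In this sketch a thread that never executes an FP instruction never
causes an FPU save or load, and the cost of a hand-over is bounded by
one save plus one load (or one initialization), which is the fixed
maximum duration mentioned above.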