From: Ross Younger
Date: Tue, 06 Oct 2009 13:51:00 -0000
To: Jonathan Larmour
CC: eCos developers
Subject: Re: NAND technical review
Message-ID: <4ACB4B58.2040804@ecoscentric.com>
In-Reply-To: <4AC6218C.20407@jifvik.org>

Jonathan Larmour wrote:
> I think at first the ball is really in Ross/eCosCentric's court to give
> the technical rationale for the decision, so I'd like to ask him first
> to give his rationale and his own perspective of the comparison of the
> pros/cons.

Here goes with a comparison between the two in something close to their
current states (my 26/08 push to bugzilla 1000770, and Rutger's r659).
For brevity, I will refer to the two layers as "E" (eCosCentric) and "R"
(Rutger) from time to time.
Note that this is only really a comparison of the two NAND layers. I have
not attempted to compare the two YAFFS porting layers, though I do mention
them in a couple of places where it seemed relevant.

BTW: I will be off-net tomorrow and all next week, so please don't think
I am ignoring the discussion...


1. NAND 101 -------------------------------------------------------------

(Those familiar with NAND chips can skip this section, but I appreciate
that not everybody on-list is in the business of writing NAND device
drivers :-) )

(i) Conceptual

A chip comprises a number of blocks (a round power of two).
Each block comprises a number of pages (another power of two).
Each page has a "main" data area (512 or 2048 bytes on current devices)
and a "spare" - aka out-of-band or OOB - area (16 or 64 bytes
respectively).

It's up to the driver and application to decide how they will use the
spare area, but it's usual for some of it to be given over to storing ECC
data, and there is space for a factory-bad marker (see below).

Programming the chip must be performed a page at a time (sometimes a 512
byte subpage). Erasing must be performed a whole block at a time.

By way of illustration, in the chip spec sheet I have to hand (Samsung
K9F1G08 series):
* 1 page = 2k byte + 64 spare
* 1 block = 64 pages
* The whole chip has 1024 blocks, making for 128MB (1Gbit) of data and
  4MB (32Mbit) of spare area.

Now, I mentioned ECC data. NAND technology has a number of underlying
limitations, importantly that it has reliability issues. I don't have a
full picture - the manufacturers seem to be understandably coy - but my
understanding is that on each page, a driver ought to be able to cope with
a single bit having flipped either on programming or on reading. The
recommended way to achieve this is by storing an ECC in the spare area:
the algorithm published by Samsung is popular, requiring 22 bits of ECC
per 256 bytes of data and able to correct a 1 bit error and detect a 2 bit
error.

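To make the "22 bits per 256 bytes" figure concrete, here is a minimal
sketch of a Hamming-style code with those properties. It is an
illustration under stated assumptions, not Samsung's published bit layout
and not the code used by either NAND layer: 256 bytes hold 2048 data bits,
so each bit has an 11-bit address, and the code stores the XOR of the
addresses of all set bits plus the XOR of their complements (11 + 11 = 22
bits). The names ecc22, ecc22_calc and ecc22_fix are invented for this
example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define ECC_BLOCK     256u    /* data bytes covered by one code */
#define ECC_ADDR_MASK 0x7FFu  /* 11 bits address the 2048 data bits */

/* Hypothetical container for the 22 ECC bits (11 used in each half). */
typedef struct { uint16_t odd, even; } ecc22;

/* XOR together the addresses (and complemented addresses) of all set bits. */
static ecc22 ecc22_calc(const uint8_t *data)
{
    ecc22 e = { 0, 0 };
    assert(data != NULL);
    for (unsigned i = 0; i < ECC_BLOCK; i++) {
        for (unsigned j = 0; j < 8; j++) {
            if (data[i] & (1u << j)) {
                unsigned addr = (i << 3) | j;
                e.odd  ^= (uint16_t)addr;
                e.even ^= (uint16_t)(~addr & ECC_ADDR_MASK);
            }
        }
    }
    return e;
}

/* Check data against a stored code, repairing a single flipped data bit.
 * Returns 0 if clean, 1 if one data bit was corrected, 2 if the stored
 * code itself has a single bad bit, -1 if the damage is uncorrectable. */
static int ecc22_fix(uint8_t *data, ecc22 stored)
{
    ecc22 now = ecc22_calc(data);
    unsigned so = (unsigned)(stored.odd ^ now.odd);
    unsigned se = (unsigned)(stored.even ^ now.even);
    unsigned syn = (so << 11) | se;

    if (syn == 0)
        return 0;
    if ((so ^ se) == ECC_ADDR_MASK) {
        /* The two halves are exact complements: 'so' is the bad bit's address. */
        data[so >> 3] ^= (uint8_t)(1u << (so & 7u));
        return 1;
    }
    if ((syn & (syn - 1u)) == 0)
        return 2;   /* exactly one syndrome bit set: the ECC took the hit */
    return -1;      /* e.g. two data bits flipped */
}
```

A driver would call ecc22_calc() on each 256-byte chunk before programming
and store the result in the spare area, then call ecc22_fix() after
reading the chunk back. Because the bit layout here is invented, codes
produced by this sketch are not interchangeable with those written by real
drivers.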
There is also the question of bad blocks. Again, full details are sketchy.
A chip may be shipped with a number of "factory-bad" blocks (e.g. up to 20
on this Samsung chip); they are marked as such in their spare area. (What
constitutes a "bad" block is not published; one imagines that the factory
have access to more test information than users do and that there may be
statistical techniques involved in judging the likely reliability of the
block.) Blocks may also fail during the life of the device, usually by
the chip reporting a failure during a program or erase operation. Because
of this, the manufacturers recommend that chip drivers scan the device for
factory-bad markers then create and maintain a Bad Block Table throughout
the life of the device. How this is done is not prescribed, but the
behaviour of the Linux MTD layer is something approximating a de facto
standard.

(ii) Chip comms protocol

Getting data into and out of the chip involves a simple protocol sequence.
Commands are single bytes; addresses are sequences of a few bytes
depending on the chip size and the operation invoked. For example, the
sequence to read a page of data on the chip whose spec sheet I have to
hand is:
* Write 0x00 into the command latch
* Write the four address bytes in turn into the address latch
* Write 0x30 into the command latch
* Chip signals Busy; wait for it to signal Ready
* Read out (up to) 2112 bytes of data.

However, not all chips are quite the same. The ONFI initiative is an
attempt to standardise chip protocols and most new chips should comply
with it. A number of chips on the market are _nearly_ ONFI-compliant:
deviations typically occur over the format of the ReadID response and that
of an address. I believe that older chips did their own thing entirely.

(iii) Electrical

Most, if not all, NAND chips have the same broad electrical interface.

There is a master Chip Enable line; nothing happens if this is not active.
Data flows into and out of the chip via its data bus, which is 8 or 16
bits wide, mediated by Read Enable and Write Enable lines.

Commands and addresses are sent on the data bus, but routed to the
appropriate latches by asserting the Address Latch Enable or Command Latch
Enable lines at the same time.

There is also a ready/busy line which the driver can use to tell when an
operation is in progress. Typical operation times from the Samsung spec
sheet I have to hand are 25us for a page read, 300us for a page program,
and 2ms for a block erase.

(iv) Board hook-up

What's more interesting is how the lines are hooked up to the board.

It is quite commonplace for a board based on a SoC to make good use of an
onboard memory controller or dedicated NAND controller. This allows the
controller to be programmed with the electrical profile the chip expects,
which makes life easy for the device driver: often, you just have to write
bytes to the relevant MMIO register address as fast as you wish and the
controller takes care of the rest.

If the NAND lines are connected to the CPU only as GPIO, the driver has a
lot of work to do in conforming to the correct signal profile at every
step of the chip protocol. (I haven't had to produce such a port, and I
don't think Rutger has needed one either, though he has produced an
untested example driver.)

In the case of a dedicated NAND controller, it is common to provide
hardware-assistance for ECC calculation. Where available, this provides a
significant speed-up (about 40% per page in my benchmarking).

Sometimes the ready/busy line isn't wired in or requires a jumper to be
set to route it. This can be worked around: for a read operation, one can
just insert a delay loop for the prescribed maximum time, while for
programs and erases, most (all?) chips have a "Read Status" command which
can be used to query whether the operation has completed.

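As a rough illustration of the Read Status workaround just described, the
sketch below polls the chip's status byte until it reports ready. The
command byte 0x70 and the status bits used (bit 6 = ready, bit 0 = fail)
are the usual ones on Samsung-style and ONFI command sets, but the chip's
data sheet is the authority. The nand_bus hooks, the poll limit and the
fake_chip test double are all hypothetical names invented for this
example; they belong to neither NAND layer.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define NAND_CMD_READ_STATUS 0x70u
#define NAND_STATUS_FAIL     0x01u  /* bit 0: last program/erase failed */
#define NAND_STATUS_READY    0x40u  /* bit 6: chip is ready for a new command */

/* Hypothetical low-level hooks; a real port would point these at the
 * platform's command latch and data register accessors. */
typedef struct {
    void    (*write_cmd)(void *ctx, uint8_t cmd);
    uint8_t (*read_data)(void *ctx);
    void     *ctx;
} nand_bus;

/* Wait for a program or erase to finish without using the ready/busy line.
 * Returns 0 on success, -1 if the chip reported failure, -2 on timeout. */
static int nand_wait_by_status(const nand_bus *bus, unsigned max_polls)
{
    assert(bus != NULL && bus->write_cmd != NULL && bus->read_data != NULL);
    bus->write_cmd(bus->ctx, NAND_CMD_READ_STATUS);
    while (max_polls-- > 0) {
        uint8_t st = bus->read_data(bus->ctx);
        if (st & NAND_STATUS_READY)
            return (st & NAND_STATUS_FAIL) ? -1 : 0;
    }
    return -2;
}

/* Toy stand-in for a chip, for host-side experiments only. */
typedef struct { unsigned busy_polls; int fail; uint8_t last_cmd; } fake_chip;

static void fake_write_cmd(void *ctx, uint8_t cmd)
{
    ((fake_chip *)ctx)->last_cmd = cmd;
}

static uint8_t fake_read_data(void *ctx)
{
    fake_chip *c = ctx;
    if (c->busy_polls > 0) {
        c->busy_polls--;
        return 0x00;                /* ready bit clear: still busy */
    }
    return (uint8_t)(NAND_STATUS_READY | (c->fail ? NAND_STATUS_FAIL : 0u));
}
```

On real hardware one would typically put a short delay between polls and
derive max_polls from the data sheet's worst-case program or erase time.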
It can be beneficial to be able to set up the ready/busy line as an
interrupt source, as opposed to having to poll it. Whilst there is an
overhead involved in context-switching, if other application threads have
much to do it may be advantageous overall for the thread waiting for the
NAND to sleep until woken by interrupt.

Of course, it is possible to put multiple chips on a board. In that case
there needs to be a way to route between them; I would expect this to be
done with the Chip Select line, addressed either by different MMIO
addresses or a separate GPIO or CPLD step. Theoretically, multiple chips
could be hooked up in parallel to give something that looks like a 16 or
32-bit "wide" chip, but I have never encountered this in the NAND world,
and it would impose a certain extra level of complexity on the driver.


2. Application interface -----------------------------------------------

Both layers have broadly similar application interfaces.

In both layers, an application must first use a `lookup' call which
provides a pointer to a device context struct. In Rutger's layer, devices
are identified by device number; in eCosCentric's, by a textual name set
in the board HAL.

Both layers provide a means of finding out about the device. R's provides
a call which returns an info block; E's provides macros which retrieve
information from the device struct (which may also be queried directly).

The basic operations required are reading a page, programming a page and
erasing a block, and both layers provide these. The page-oriented
operations optionally allow read/write of the page spare area. These
operations also automatically calculate and check an ECC, if the device
has been configured to do so.

Rutger's layer has an extra hook in place where an application may
explicitly request the use of cached reading and writing where the device
supports this.

Both layers also support the necessary ancillary operations of querying
the status of a block in the bad-block table, and marking a block as bad.

(a) Partitions

E's application interface also provides logic implementing partitions.
That is to say, all access to a NAND array must be via a `partition'; the
NAND layer sanity-checks whether the requested flash page or block address
is within the given partition. This is quite a lightweight layer and
hasn't added much overhead of either code footprint or execution time.

The presence of partitions in E's model was controversial, as are its fine
details. Nevertheless, some notion of partitioning turns out to be
essential on some boards. In some recent work for a customer we identified
three separate regions of NAND: somewhere to put the boot loader (primary,
as booted by ROM, and RedBoot), somewhere for the application image itself
(perhaps FIS-like rather than a full filesystem), and a filesystem for the
application to use as it pleases.

R's interface does not have such a facility. It appears that, in the event
that the flash is shared between two or more logical regions, it's up to
higher-level code to be configured with the correct block ranges to use.

(b) Dynamic memory allocation

R's layer mandates the provision of malloc and free, or compatible
functions. These must be provided to the cyg_nand_init() call.

E's doesn't; instead it declares a small number of static buffers.

Andrew Lunn opined on 6/3/09 that R's requirement for malloc is not a
major issue because the memory needs of that layer are well-bounded; I
think I broadly agree, though the situation is not ideal in that it forces
somebody who wants to use a lean, mean eCos configuration to work around
it. Also note that if you're going to run a full file system like YAFFS,
you can't avoid needing malloc, but in an application making simpler use
of NAND, it's an overhead that you may prefer to avoid.


3. Driver model --------------------------------------------------------

The major architectural difference between the two NAND layers is in their
driver models and the degree of abstraction enforced.

In Rutger's layer, controllers and chips are both formally abstracted. The
application talks to the Abstract NAND Chip, which has (hard-coded) the
basic sequences of commands, addresses and data required to talk to a NAND
chip. This layer talks to a controller driver, which provides the nuts and
bolts of reading and writing to the device. The chip driver is also called
by the ANC layer, and provides the really chip-specific parts.

The call flow looks something like this (best viewed in fixed-width font):

Application --(H)-> ANC --(L)-> Controller driver
                     \
                      \-(C)-> Chip driver

H: high-level interface (read page, program page, erase block;
   chip (de)selection)
L: low-level interface (read/write commands, addresses, data;
   query the busy line)
C: chip-specific details (chip init, parse ReadID, query factory-bad
   marker)

In eCosCentric's layer, a NAND driver is a single abstraction covering
chip init and querying the factory-bad status as well as the high level
functions (reading a page, etc). It is left to the driver to determine the
sequence of commands to send.

How the driver interacts with the device is considered to be a contract
only between the driver and the relevant platform HAL, so is not formally
abstracted by the NAND layer. E's chip drivers are written as .inl files,
intended to be included by the relevant platform HALs by whichever source
file provides the required low-level functions. The lack of a formal
abstraction is an attempt to provide a leaner and meaner experience at
runtime: the low-level functions can be (and indeed are, so far) provided
as static inlines.

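The static-inline arrangement described above might be sketched roughly as
follows. This is an illustration only, not code from either NAND layer:
every name in it (plat_nand_cmd, chip_read_page and so on) is hypothetical,
plain variables stand in for the controller's MMIO registers so the
fragment can be built on a host, and the split of the four address bytes
into two column bytes and two row bytes is an assumption that suits a
large-page 1Gbit part. The command bytes 0x00 and 0x30 are those of the
page-read sequence given in section 1(ii).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* ---- Part 1: would live in the platform HAL source file.
 * Variables stand in for MMIO registers; the trace is only there so the
 * sequence can be inspected on a host. */
static volatile uint8_t plat_cmd_reg, plat_addr_reg, plat_data_reg;
static uint8_t  plat_trace[8];
static unsigned plat_trace_len;

static inline void plat_trace_byte(uint8_t b)
{
    if (plat_trace_len < sizeof plat_trace)
        plat_trace[plat_trace_len++] = b;
}

static inline void plat_nand_cmd(uint8_t c)
{
    plat_cmd_reg = c;               /* write to the command latch */
    plat_trace_byte(c);
}

static inline void plat_nand_addr(uint8_t a)
{
    plat_addr_reg = a;              /* write to the address latch */
    plat_trace_byte(a);
}

static inline void plat_nand_wait_ready(void)
{
    /* A real HAL would poll the ready/busy line or delay here. */
}

static inline uint8_t plat_nand_read(void)
{
    return plat_data_reg;           /* read one byte from the data register */
}

/* ---- Part 2: would live in the chip driver .inl, #included by the HAL
 * file after the functions above, so the compiler can inline them. */
static int chip_read_page(uint32_t page, uint8_t *buf, size_t len)
{
    assert(buf != NULL);
    plat_nand_cmd(0x00);                             /* read, first cycle  */
    plat_nand_addr(0x00);                            /* column, low byte   */
    plat_nand_addr(0x00);                            /* column, high byte  */
    plat_nand_addr((uint8_t)(page & 0xFFu));         /* row, low byte      */
    plat_nand_addr((uint8_t)((page >> 8) & 0xFFu));  /* row, high byte     */
    plat_nand_cmd(0x30);                             /* read, second cycle */
    plat_nand_wait_ready();
    for (size_t i = 0; i < len; i++)
        buf[i] = plat_nand_read();
    return 0;
}
```

Because the chip code and the low-level accessors end up in one
translation unit, the compiler is free to inline each register access,
which is the function-call saving this model is aiming for.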
The flow looks like this:

Application --(H1)-> NAND layer --(H2)-> NAND driver --(L*)-> Platform HAL

H1: high-level calls (read page, program page, erase block)
H2: high-level calls (as H1, plus device init and query factory-bad
    marker)
L*: low-level calls, like L above but not formally abstracted

The two models have pros and cons in both directions.

- As hinted at above, the static inline model of E's low-level access
  functions is expected to turn out to have a lower function call (and,
  generally, code size) overhead than R's.

- R's model shares the command sequence logic amongst all chips,
  differentiating only between small- and large-page devices. (I do not
  know whether this is correct for all current chips, though going
  forwards it seems less likely to be an issue as fully-ONFI-compliant
  chips become the norm.) If multiple chips of different types are present
  in a build, E's model potentially duplicates code (though this could be
  worked around; also, an ONFI driver ought to be written).

- A corollary of arguably inconsequential import: R's model forces the
  synth driver to emulate an entire NAND chip and its protocol. E's synth
  doesn't need to.

- E's high-level driver interface makes it harder to add new functions
  later, necessitating a change to that API (H2 above). R's does not; the
  requisite logic would only need to be added to the ANC. It is not
  thought that more than a handful of such changes will ever be required,
  and it may be possible to maintain backwards compatibility. (As a case
  in point, support for hardware ECC is currently work-in-progress within
  eCosCentric, and does require such a change, but now is not the right
  time to discuss that.)

It would perhaps be interesting to compare the complexities of drivers for
the two models, but it's not readily apparent how we would do that fairly.
Perhaps porting a driver from one NAND layer to the other would be a
useful exercise, and would also allow us to compare code sizes.
Any suggestions or (he says hopefully) volunteers? I've got a lot on my
plate this month...


4. Feature/implementation differences ------------------------------------

(I don't consider these to be significant issues; whilst noteworthy, I
don't think they would take much effort to resolve.)

(a) Documentation

The two layers' documentation differs in depth and layout; these are
difficult for me to compare objectively, and I would suggest that a fresh
pair of eyes compare them.

I can only offer the comment that I documented the E layer bearing in mind
what I considered to be missing from the R layer documentation: it was not
clear how the controller and chip layers inter-related, nor where to start
in creating a driver. (I also had a lot less experience of NAND chips then
than I do now, and what I need to know now is different from what a newbie
would.)

(b) Availability of drivers

R provides support for:
- One board: BlackFin EZ-Kit BF548 (which is not in anoncvs?)
- One chip: the ST Micro 0xG chip (large page, x8 and x16 present but
  presumably only tested on the x8 chip on the BlackFin board?)
- A synthetic controller/chip package
- A template for a GPIO-based controller (untested, intended as an example
  only)

I seem to remember rumours of the existence of a driver for a further
chip+board combination, but I haven't seen it.

E provides support for:
- Two boards: Embedded Artists LPC2468 (very well tested); STM3210E
  (largely complete, based on work by Simon K; some enhancements planned)
- Two chips: Samsung K9 family (large page, only x8 done so far);
  ST-Micro NANDxxxx3A (small page, x8) (based on work by Simon K)
- Synthetic target. This offers more features than R's: bad block
  injection, logging, and a GUI interface via the synth I/O auxiliary.
- Further (customer-confidential) board ports.

(c) RedBoot support

E have added some commands for NAND operations and tested on the EA
LPC2468 board.
(YAFFS support works via the existing RB fileio layer; nothing really
needed to be done.)

(d) Degree of testing

There are presumably differences of coverage here; both E and R assert
they have carried out stress tests. Properly comparing the depth of the
two would be a job for fresh eyes.

E have:
- a handful of unit and functional tests of the NAND layer, and a
  benchmarker
- a number of YAFFS functional tests, one of which includes benchmarking,
  and a further severe YAFFS stress test: these indirectly test the NAND
  layer. (The latter has been run under the synth driver with bad-block
  injection turned on, and has revealed some subtle bugs which we probably
  wouldn't otherwise have caught.)
- the ability to run continual test cycles in their test farm


5. Works in progress -----------------------------------------------------

I can of course only comment on eCosCentric's plans, but the following
work is in the pipeline:

* Expansion of the device interface to better allow efficient hardware ECC
  support (in progress)
* Hardware ECC for the STM3210E board driver
* Performance tuning of software ECC and of NAND low-level drivers
* Partition addressing: make addressing relative to the start of the
  partition, once and for all
* Simple raw NAND "filesystem" for use by RedBoot (see
  http://ecos.sourceware.org/ml/ecos-devel/2009-07/msg00004.html et seq;
  those are the latest public mails but not the latest version of my
  thinking, which I will update in due course)
* More RedBoot NAND utility commands
* Support for booting Linux off NAND and for sharing a (YAFFS)
  NAND-resident filesystem
* Part-page read support (would provide a big speed-up to parts of YAFFS2
  inbandTags mode as needed by small-page devices like that on the
  STM3210E)

--------------------------------------------------------------------------

Ross

-- 
Embedded Software Engineer, eCosCentric Limited.
Barnwell House, Barnwell Drive, Cambridge CB5 8UU, UK.
Registered in England no. 4422071. www.ecoscentric.com