From: Jürgen Lambrecht
Date: Thu, 08 Oct 2009 08:16:00 -0000
To: eCos developers
Subject: Re: Re: NAND technical review
Message-ID: <4ACD9FC3.1030508@televic.com>
In-Reply-To: <4ACB4B58.2040804@ecoscentric.com>

Just some explanatory remarks below, hardware related.

Ross Younger wrote:
> 1. NAND 101 -------------------------------------------------------------
>
> (Those familiar with NAND chips can skip this section, but I appreciate
> that not everybody on-list is in the business of writing NAND device
> drivers :-) )
>
> (i) Conceptual
>
>
> Now, I mentioned ECC data. NAND technology has a number of underlying
> limitations, importantly that it has reliability issues.
> I don't have a full
> picture - the manufacturers seem to be understandably coy - but my
> understanding is that on each page, a driver ought to be able to cope with a
> single bit having flipped either on programming or on reading. The
>

Such a "broken bit" arises because the transistor that holds the bit is
physically broken and is stuck at 1 or at 0 (I don't know whether it can
be both). So you can no longer erase it (flip it back to 1) or program it
(flip it to 0).
I thought only programming or erasing could break a bit, not reading? Is
somebody sure about this?

> recommended way to achieve this is by storing an ECC in the spare area: the
> algorithm published by Samsung is popular, requiring 22 bits of ECC per 256
> bytes of data and able to correct a 1 bit error and detect a 2 bit error.
>
> There is also the question of bad blocks. Again, full details are sketchy. A
> chip may be shipped with a number of "factory-bad" blocks (e.g. up to 20 on
> this Samsung chip); they are marked as such in their spare area. (What
> constitutes a "bad" block is not published; one imagines that the factory
> have access to more test information than users do and that there may be
> statistical techniques involved in judging the likely reliability of the
> block.) Blocks may also fail during the life of the device, usually by the
>

NAND flash chips are very dense (many bits in a small area), and
manufacturing involves a trade-off between reliability and density. To
make the chips dense (hence cheap), faults have to be tolerated. The
manufacturer just tries to program all bits a first time to check for
manufacturing errors. When a broken bit is discovered, the entire block is
marked bad.

> chip reporting a failure during a program or erase operation. Because of
> this, the manufacturers recommend that chip drivers scan the device for
> factory-bad markers then create and maintain a Bad Block Table throughout
> the life of the device.
> How this is done is not prescribed, but the
> behaviour of the Linux MTD layer is something approximating a de facto
> standard.
>
> (iii) Electrical
>
> Most, if not all, NAND chips have the same broad electrical interface.
>
> There is a master Chip Enable line; nothing happens if this is not active.
>

(A hardware designer's note follows :-)
Be careful here: a standard chip enable is only active during the actual
read or write. But an access to a NAND flash is a complete cycle, during
which the NAND flash's embedded control logic needs to keep its state!
Therefore the Chip Enable (or Chip Select) of the NAND flash is (on my
ARM9, anyhow) connected to a GPIO (general-purpose input/output) pin, and
the software has to assert this pin at the start of an access and
de-assert it at the end. The real hardware Chip Select pin is not
connected.
(In R's software this is cyg_nand_ctl_chip_select in
io/flash_nand/../controller, which calls chip_select as implemented in the
board-specific driver in /devs/flash/[uC brand].)

> Data flows into and out of the chip via its data bus, which is 8 or 16 bits
> wide, mediated by Read Enable and Write Enable lines.
>
> Commands and addresses are sent on the data bus, but routed to the
> appropriate latches by asserting the Address Latch Enable or Command Latch
> Enable lines at the same time.
>
> There is also a ready/busy line which the driver can use to tell when an
> operation is in progress. Typical operation times from the Samsung spec
> sheet I have to hand are 25us for a page read, 300us for a page program, and
> 2ms for a block erase.
>
>
> (iv) Board hook-up
>
> Sometimes the ready/busy line isn't wired in or requires a jumper to be set
> to route it. This can be worked around: for a read operation, one can just
> insert a delay loop for the prescribed maximum time, while for programs and
> erases, most (all?) chips have a "Read Status" command which can be used to
> query whether the operation has completed.
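
The "Read Status" workaround quoted above, combined with a GPIO-driven
chip select, could look roughly like the sketch below. It is only an
illustration: the nand_bus hooks and the sim_chip are made up so the code
can run on a host, and they are not the API of either NAND package under
review. The 0x70 command and the ready/fail status bits follow the usual
NAND datasheet convention; check the data sheet of the actual part. The
status query is treated here as one access of its own.

```c
#include <stdbool.h>
#include <stdint.h>

#define NAND_CMD_READ_STATUS 0x70u
#define NAND_STATUS_FAIL     0x01u  /* I/O0: last program/erase failed */
#define NAND_STATUS_READY    0x40u  /* I/O6: chip is ready */

/* Hypothetical board hooks; a real driver would drive GPIO/MMIO here. */
typedef struct nand_bus {
    void    (*chip_select)(struct nand_bus *bus, bool active);
    void    (*write_cmd)(struct nand_bus *bus, uint8_t cmd);
    uint8_t (*read_data)(struct nand_bus *bus);
    void    *priv;
} nand_bus;

typedef enum { NAND_OK = 0, NAND_OP_FAILED = -1, NAND_TIMEOUT = -2 } nand_result;

/* Poll the status register until the chip is ready or max_polls is used up.
 * Chip select is asserted by software for the whole access and released
 * at the end, whatever the outcome. */
nand_result nand_wait_status(nand_bus *bus, unsigned max_polls)
{
    nand_result res = NAND_TIMEOUT;

    bus->chip_select(bus, true);
    bus->write_cmd(bus, NAND_CMD_READ_STATUS);
    for (unsigned i = 0; i < max_polls; i++) {
        uint8_t status = bus->read_data(bus);
        if (status & NAND_STATUS_READY) {
            res = (status & NAND_STATUS_FAIL) ? NAND_OP_FAILED : NAND_OK;
            break;
        }
    }
    bus->chip_select(bus, false);
    return res;
}

/* Simulated chip, for trying the sketch without hardware. */
typedef struct {
    unsigned busy_reads;  /* status reads that still report "busy" */
    bool     fail;        /* report a failed operation once ready */
    bool     selected;
    uint8_t  last_cmd;
} sim_chip;

static void sim_chip_select(nand_bus *bus, bool active)
{
    ((sim_chip *)bus->priv)->selected = active;
}

static void sim_write_cmd(nand_bus *bus, uint8_t cmd)
{
    ((sim_chip *)bus->priv)->last_cmd = cmd;
}

static uint8_t sim_read_data(nand_bus *bus)
{
    sim_chip *c = bus->priv;

    /* Nothing happens unless the chip is enabled and in status mode. */
    if (!c->selected || c->last_cmd != NAND_CMD_READ_STATUS)
        return 0;
    if (c->busy_reads > 0) {
        c->busy_reads--;
        return 0;  /* still busy */
    }
    return (uint8_t)(NAND_STATUS_READY | (c->fail ? NAND_STATUS_FAIL : 0u));
}

nand_bus sim_bus(sim_chip *chip)
{
    nand_bus bus = { sim_chip_select, sim_write_cmd, sim_read_data, chip };
    return bus;
}
```

With a real ready/busy line, the polling loop would read that pin instead
of the status register; the software chip select stays the same.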

We started our driver this way.

> It can be beneficial to be able to set up the ready/busy line as an
> interrupt source, as opposed to having to poll it. Whilst there is an
> overhead involved in context-switching, if other application threads have
> much to do it may be advantageous overall for the thread waiting for the
> NAND to sleep until woken by interrupt.
>

To speed things up, we now poll the ready/busy line. Using it as an
interrupt source is still to do.

> Of course, it is possible to put multiple chips on a board. In that case
> there needs to be a way to route between them; I would expect this to be
> done with the Chip Select line, addressed either by different MMIO addresses
> or a separate GPIO or CPLD step. Theoretically, multiple chips could be
> hooked up in parallel to give something that looks like a 16 or 32-bit
> "wide" chip, but I have never encountered this in the NAND world, and it
> would impose a certain extra level of complexity on the driver.
>

Indeed, this would be difficult: a NAND is not a simple memory-mapped
device like a NOR flash or SRAM, which are easy to put in parallel. Bad
block management alone makes it difficult: each chip has its own bad
blocks, so the chips cannot simply be paralleled in hardware and need to
be addressed separately. They must then be made parallel virtually, in
software.

Regards,
Jürgen
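
To illustrate the last point, here is a minimal sketch of what "parallel
in software" might mean for two chips whose bad blocks differ. Everything
in it is an assumption made for illustration (the chip_map type, the tiny
BLOCKS_PER_CHIP, the simple skip-bad-blocks mapping); it is not taken from
any existing driver. Each chip keeps its own bad block table, and one
logical block of the virtual "wide" device resolves to a different
physical block on each chip.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define BLOCKS_PER_CHIP 8u       /* tiny on purpose, illustration only */
#define NO_BLOCK        0xFFFFu  /* marks an unused mapping slot */

/* Hypothetical per-chip state: bad block table plus derived mapping. */
typedef struct {
    bool     bad[BLOCKS_PER_CHIP];       /* true = block is bad */
    uint16_t log2phys[BLOCKS_PER_CHIP];  /* logical -> good physical block */
    size_t   good_count;
} chip_map;

/* Build the logical-to-physical mapping by skipping bad blocks. */
void chip_map_build(chip_map *m)
{
    m->good_count = 0;
    for (uint16_t phys = 0; phys < BLOCKS_PER_CHIP; phys++) {
        if (!m->bad[phys])
            m->log2phys[m->good_count++] = phys;
    }
    for (size_t i = m->good_count; i < BLOCKS_PER_CHIP; i++)
        m->log2phys[i] = NO_BLOCK;
}

/* The pair can only offer as many blocks as the worse chip has good ones. */
size_t pair_capacity(const chip_map *a, const chip_map *b)
{
    return a->good_count < b->good_count ? a->good_count : b->good_count;
}

/* Resolve one logical block of the virtual wide device to a physical
 * block on each chip. Returns false if the logical block does not exist. */
bool pair_resolve(const chip_map *a, const chip_map *b, size_t logical,
                  uint16_t *phys_a, uint16_t *phys_b)
{
    if (logical >= pair_capacity(a, b))
        return false;
    *phys_a = a->log2phys[logical];
    *phys_b = b->log2phys[logical];
    return true;
}
```

For example, if chip A has bad block 1 and chip B has bad blocks 0 and 5,
the pair offers 6 logical blocks, and logical block 4 lands on physical
block 5 of chip A but physical block 6 of chip B. The two chips therefore
need different addresses for the same logical access, which a plain
hardware parallel hook-up cannot provide.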