From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <ecos-devel-return-1660-listarch-ecos-devel=sources.redhat.com@ecos.sourceware.org>
Received: (qmail 11224 invoked by alias); 16 Oct 2009 04:04:18 -0000
Received: (qmail 10742 invoked by uid 22791); 16 Oct 2009 04:04:15 -0000
X-SWARE-Spam-Status: No, hits=-2.4 required=5.0 	tests=AWL,BAYES_00
X-Spam-Check-By: sourceware.org
Received: from virtual.bogons.net (HELO virtual.bogons.net) (193.178.223.136)     by sourceware.org (qpsmtpd/0.43rc1) with ESMTP; Fri, 16 Oct 2009 04:04:10 +0000
Received: from jifvik.dyndns.org (jifvik.dyndns.org [85.158.45.40]) 	by virtual.bogons.net (8.10.2+Sun/8.11.2) with ESMTP id n9G446424359; 	Fri, 16 Oct 2009 05:04:06 +0100 (BST)
Received: from [172.31.1.126] (neelix.jifvik.org [172.31.1.126]) 	(using TLSv1 with cipher DHE-RSA-AES256-SHA (256/256 bits)) 	(No client certificate requested) 	by jifvik.dyndns.org (Postfix) with ESMTP id 50DF93FEB; 	Fri, 16 Oct 2009 05:04:06 +0100 (BST)
Message-ID: <4AD7F0B5.9050101@jifvik.org>
Date: Fri, 16 Oct 2009 04:04:00 -0000
From: Jonathan Larmour <jifl@jifvik.org>
User-Agent: Mozilla Thunderbird 1.0.8-1.1.fc4 (X11/20060501)
MIME-Version: 1.0
To: Rutger Hofman <rutger@cs.vu.nl>
Cc: Ross Younger <wry@ecoscentric.com>, ecos-devel@ecos.sourceware.org
Subject: Re: NAND technical review
References: <4AC6218C.20407@jifvik.org> <4ACB4B58.2040804@ecoscentric.com> <4ACC0722.9020601@jifvik.org> <4ACDF868.7050706@ecoscentric.com> <4ACEF3D1.1090609@ecoscentric.com> <4AD3E412.80002@jifvik.org> <4AD48367.8050807@cs.vu.nl>
In-Reply-To: <4AD48367.8050807@cs.vu.nl>
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: 7bit
Mailing-List: contact ecos-devel-help@ecos.sourceware.org; run by ezmlm
Precedence: bulk
List-Id: <ecos-devel.ecos.sourceware.org>
List-Subscribe: <mailto:ecos-devel-subscribe@ecos.sourceware.org>
List-Post: <mailto:ecos-devel@ecos.sourceware.org>
List-Help: <mailto:ecos-devel-help@ecos.sourceware.org>, <http://ecos.sourceware.org/lists.html#faqs>
Sender: ecos-devel-owner@ecos.sourceware.org
X-SW-Source: 2009-10/txt/msg00029.txt.bz2

Rutger Hofman wrote:
> Jonathan Larmour wrote:
> 
>> Hmm, I guess the key thing here is that in E's implementation most of 
>> the complexity has been pushed into the lower layers; at least 
>> compared to R's. R's has a more consistent interface through the 
>> layers. Albeit at the expense of some rigidity and noticeable function 
>> overhead.
>>
>> It's not likely E's will be able to easily share controller code, 
>> given of course you don't know what chips, and so what chip driver 
>> APIs they'll be connected to. But OTOH, maybe this isn't a big deal 
>> since a lot of the controller-specific munging is likely to be 
>> platform-specific anyway due to characteristics of the attached NAND 
>> (e.g. timings etc.) and the only bits that would be sensibly shared 
>> would potentially happen in the processor HAL anyway at startup time. 
>> What's left may not be that much and isn't a problem in the platform 
>> HAL. However the likely exception to that is hardware-assisted ECC. A 
>> semi-formal API for that would be desirable.
> 
> 
> This is the largest difference in design philosophy between E and R. Is 
> it OK if I expand?

Sure.

> NAND chips are all identical in their wire setup. They all have a data 
> 'bus', and control lines to indicate whether what is on the bus is a 
> command, an address, or data.
> 
> NAND chips differ in how their command language works, but only so far. 
> What is on the market now is 'regular' large-page chips that all speak 
> the same command language, and small-page chips that have a somewhat 
> different command language. ONFI chips are large-page chips except in 
> interrogation at startup and in bad-block marking.

As I've already noted, it may be useful to think ahead to what may come 
onto the market later, including things that don't fit the known command 
languages (such as the existing OneNAND). A framework which can support a 
wider range of implementations has the advantage there.

[snip example]
> These 2 languages are all the variation there is for NAND chips (plus, 
> at another level, 2 timing values for read cycle and write cycle)! The 
> wide-ranging differences for devices for NAND are in the controllers.
> 
> How controllers work, is that they accept input like 'write a command of 
> value 0x..', 'write an address of value 0x.....', etc, and do their job 
> on the NAND chip's wires. They cannot really operate at a higher level, 
> if only because they must support both small-page and large-page chips 
> (and ONFI), and this is the level of common protocol for the chips.
> 
> So controller code has to bridge between API calls like page_read and 
> the interface of the controller as described above. R's implementation 
> presumes that a lot of the code to make this translation is generic: a 
> large-page read translates to the controller steps as given above in the 
> running example, in any controller implementation.

That's true. At the same time, have a look at E's code in 
https://bugzilla.ecoscentric.com/show_bug.cgi?id=1000770
Specifically the Samsung K9 driver in 
devs/nand/samsung_k9/d20090826/include/k9fxx08x0x.inl - while you could 
argue the steps required are generic and can be made common (write this 
address, write that command, etc.), it seems E assumes that the steps may 
not really be complex enough to justify abstracting them out.

I would certainly be interested in your perspective about what E's driver 
implementation lacks compared to R's. Lack of hardware ECC is one thing 
certainly.
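
For concreteness, the kind of generic translation under discussion (a 
page read expressed as "write command, write address, wait, read data" 
steps on a controller) might be sketched roughly as below. This is only 
an illustration: struct nfc_ops, nand_lp_page_read() and the trace_* mock 
are hypothetical names, not taken from either E's or R's code. The 
00h/30h command pair is the usual large-page READ sequence; the number of 
row address cycles is an assumption that depends on the chip.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical controller interface; illustrative names only. */
struct nfc_ops {
    void (*write_cmd)(void *priv, uint8_t cmd);
    void (*write_addr)(void *priv, uint8_t addr_byte);
    int  (*wait_ready)(void *priv);               /* 0 on success */
    void (*read_data)(void *priv, uint8_t *buf, size_t len);
};

#define NAND_CMD_READ_1ST 0x00  /* large-page READ, first cycle  */
#define NAND_CMD_READ_2ND 0x30  /* large-page READ, second cycle */

/* Generic large-page read: two column cycles (here column 0), then
 * row_cycles row address cycles, least significant byte first. */
int nand_lp_page_read(const struct nfc_ops *ops, void *priv,
                      uint32_t page, unsigned row_cycles,
                      uint8_t *buf, size_t len)
{
    unsigned i;

    ops->write_cmd(priv, NAND_CMD_READ_1ST);
    ops->write_addr(priv, 0x00);                  /* column, low byte  */
    ops->write_addr(priv, 0x00);                  /* column, high byte */
    for (i = 0; i < row_cycles; i++)
        ops->write_addr(priv, (uint8_t)(page >> (8 * i)));
    ops->write_cmd(priv, NAND_CMD_READ_2ND);
    if (ops->wait_ready(priv) != 0)
        return -1;
    ops->read_data(priv, buf, len);
    return 0;
}

/* Mock "controller" that records the bytes it is asked to put on the
 * bus, standing in for real hardware accessors. */
struct trace { uint8_t log[16]; size_t n; };

static void trace_put(void *priv, uint8_t b)
{
    struct trace *t = priv;
    if (t->n < sizeof t->log)
        t->log[t->n++] = b;
}
static int trace_ready(void *priv) { (void)priv; return 0; }
static void trace_read(void *priv, uint8_t *buf, size_t len)
{
    (void)priv;
    memset(buf, 0xFF, len);                       /* pretend page is erased */
}

const struct nfc_ops trace_ops = {
    trace_put, trace_put, trace_ready, trace_read
};
```

With something like this, a controller port would only supply the four 
accessors, and the sequencing would live in one shared place. Whether 
that sequencing is complex enough to be worth sharing is the open 
question.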

> Moreover, the generic 
> code handles spare layout: where in the spare is the application's spare 
> data folded, where is the ECC, where is the bad-block mark. 

In E's implementation, the complexity of an abstracted spare layout seems 
to start disappearing once you know more about which chip you've got, as a 
lot of that complexity has been pushed into the chip driver.
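
To make "spare layout" concrete, here is a minimal sketch of what a 
layout descriptor and a packing routine could look like. Everything in it 
(struct oob_layout, oob_pack(), example_oob_64 and its offsets) is 
hypothetical and chosen for illustration; it is not checked against 
either implementation's real layout tables.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical spare (OOB) layout descriptor; field names are invented. */
struct oob_region { uint8_t offset, length; };

struct oob_layout {
    uint8_t spare_size;       /* total spare bytes per page         */
    uint8_t bbm_offset;       /* where the bad-block marker lives   */
    struct oob_region ecc;    /* where the ECC bytes live           */
    struct oob_region app;    /* where application spare data lives */
};

/* Illustrative 64-byte layout: marker at 0, application data at 2..39,
 * ECC at 40..63. The numbers are an assumption for the example. */
const struct oob_layout example_oob_64 = { 64, 0, { 40, 24 }, { 2, 38 } };

/* Build a spare-area image from application data and ECC bytes.
 * Returns 0 on success, -1 if either input does not fit its region. */
int oob_pack(const struct oob_layout *l, uint8_t *spare,
             const uint8_t *app, size_t app_len,
             const uint8_t *ecc, size_t ecc_len)
{
    if (app_len > l->app.length || ecc_len > l->ecc.length)
        return -1;
    /* 0xFF is the erased state, so the marker byte stays "good". */
    memset(spare, 0xFF, l->spare_size);
    memcpy(spare + l->app.offset, app, app_len);
    memcpy(spare + l->ecc.offset, ecc, ecc_len);
    return 0;
}
```

A generic layer needs a table like this per layout. A chip-specific 
driver can simply hard-code the offsets it knows apply, which is where 
the abstraction starts to look optional.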

> OTOH, the 
> generic code has hooks for handling any ECC that the controller has 
> computed in hardware -- how ECC is supported in hardware varies across 
> controllers. But the way the ECC check is handled (case in point is 
> where a correctible bit error is flagged) is generic again.

In E's case, in the EA LPC2468 port example, they have the following in 
the platform HAL for a port (although it could be a package instead):

[various functions/macros defined which are used by k9fxx08x0x.inl]
#include <cyg/devs/nand/k9fxx08x0x.inl>
CYG_NAND_DEVICE(ea_nand, "onboard", &k9f8_funs, &_k9_ea_lpc2468_priv,
                 &linux_mtd_ecc, &nand_mtd_oob_64);

which succinctly brings together the chip driver, accessor functions, ECC 
algorithm, and OOB layout. It becomes easy for a board port to choose some 
different chips/layouts/ECC. There's flexibility for the future in that.
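
As a rough illustration of why that style of instantiation is flexible, 
the following sketch shows one way a single macro can bind a chip 
function table, board-private data, an ECC algorithm and an OOB layout 
into one device object. It is not the real CYG_NAND_DEVICE definition: 
NAND_DEVICE_SKETCH and all of the sketch_* types and objects are 
hypothetical stand-ins.

```c
#include <stddef.h>

/* All types below are hypothetical stand-ins, not eCos definitions. */
struct sketch_chip_funs { const char *chip_name; };
struct sketch_ecc_algo  { const char *name; unsigned data_bytes, ecc_bytes; };
struct sketch_oob_desc  { unsigned spare_size; };

struct sketch_nand_device {
    const char *devname;
    const struct sketch_chip_funs *fns;   /* chip driver entry points   */
    void *priv;                           /* board-specific accessors   */
    const struct sketch_ecc_algo *ecc;    /* ECC algorithm in use       */
    const struct sketch_oob_desc *oob;    /* spare layout in use        */
};

/* One macro ties the four pieces together into a named device. */
#define NAND_DEVICE_SKETCH(name, devname_, fns_, priv_, ecc_, oob_) \
    struct sketch_nand_device name =                                \
        { (devname_), (fns_), (priv_), (ecc_), (oob_) }

static const struct sketch_chip_funs sketch_k9_funs = { "k9" };
static const struct sketch_ecc_algo  sketch_sw_ecc  = { "software", 256, 3 };
static const struct sketch_oob_desc  sketch_oob_64  = { 64 };
static int sketch_board_priv;             /* placeholder private data   */

/* Choosing a different ECC or layout is a one-argument change here. */
NAND_DEVICE_SKETCH(sketch_nand, "onboard", &sketch_k9_funs,
                   &sketch_board_priv, &sketch_sw_ecc, &sketch_oob_64);
```

The point is only that each of the four choices is an independent 
argument, so a board port can swap one without touching the others.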

With R's implementation, there seems to be much more code involved, and I 
only partly see why. That's not just in the generic layer but in the 
drivers as well, at least looking at the bfin chip, and I don't think the 
differences are completely explained by the hardware properties of each 
NFC (but I'm very willing to be corrected!). Comparing E's k9_read_page() 
along with everything it calls, with R's bfin_nfc_data_read() along with 
everything it calls (and whatever those call in turn, not just in 
bfin_nfc.c but also in nand_ez_kit_bf548.inc[1]), there's a huge 
difference. If nothing else, from what I can tell this may mean a much 
larger porting effort than with E's.

I see that some of the reasons for larger code in R are due to run-time 
testing of hardware properties: 8 vs 16-bit bus width, SP vs LP vs ONFI. I 
also note that E's implementation doesn't do as much error checking as I 
think it ought to, especially in the Samsung K9 chip driver. But that 
doesn't account for all of the difference.

Anyway, I think I'm thinking out loud here rather than asking anything 
specific about it. It may just be something we have to put down to the 
difference in design philosophy, rather than something which can be 
improved. There are still advantages to R in other ways.

Jifl

[1] which should really be .inl for consistency in eCos but that's a detail
-- 
--["No sense being pessimistic, it wouldn't work anyway"]-- Opinions==mine