From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <ecos-devel-return-1463-listarch-ecos-devel=sources.redhat.com@ecos.sourceware.org>
Received: (qmail 9293 invoked by alias); 20 May 2009 13:24:48 -0000
Received: (qmail 9269 invoked by uid 22791); 20 May 2009 13:24:46 -0000
X-SWARE-Spam-Status: No, hits=-2.1 required=5.0 	tests=AWL,BAYES_00,SPF_PASS
X-Spam-Check-By: sourceware.org
Received: from hagrid.ecoscentric.com (HELO mail.ecoscentric.com) (212.13.207.197)     by sourceware.org (qpsmtpd/0.43rc1) with ESMTP; Wed, 20 May 2009 13:24:38 +0000
Received: from localhost (hagrid.ecoscentric.com [127.0.0.1]) 	by mail.ecoscentric.com (Postfix) with ESMTP id 6FFDF3B4004A; 	Wed, 20 May 2009 14:24:36 +0100 (BST)
Received: from mail.ecoscentric.com ([127.0.0.1]) 	by localhost (hagrid.ecoscentric.com [127.0.0.1]) (amavisd-new, port 10024) 	with ESMTP id zhkVG049gQ85; Wed, 20 May 2009 14:24:35 +0100 (BST)
Date: Wed, 20 May 2009 13:24:00 -0000
Message-Id: <pnd4a3aovi.fsf@delenn.bartv.net>
From: Bart Veer <bartv@ecoscentric.com>
To: Andrew Lunn <andrew@lunn.ch>
CC: wry@ecoscentric.com, simon.kallweit@intefo.ch, 	ecos-devel@ecos.sourceware.org
In-reply-to: <20090519141710.GJ20046@lunn.ch> (message from Andrew Lunn on 	Tue, 19 May 2009 16:17:10 +0200)
Subject: Re: NAND review
References: <4A126D59.7070404@intefo.ch> <4A12B877.9030404@ecoscentric.com> <20090519141710.GJ20046@lunn.ch>
Mailing-List: contact ecos-devel-help@ecos.sourceware.org; run by ezmlm
Precedence: bulk
List-Id: <ecos-devel.ecos.sourceware.org>
List-Subscribe: <mailto:ecos-devel-subscribe@ecos.sourceware.org>
List-Post: <mailto:ecos-devel@ecos.sourceware.org>
List-Help: <mailto:ecos-devel-help@ecos.sourceware.org>, <http://ecos.sourceware.org/lists.html#faqs>
Sender: ecos-devel-owner@ecos.sourceware.org
X-SW-Source: 2009-05/txt/msg00043.txt.bz2

>>>>> "Andrew" == Andrew Lunn <andrew@lunn.ch> writes:

    >> The partition definition is necessarily platform-specific, so
    >> doesn't fit anywhere else.

    Andrew> I simply don't get this. 

    Andrew> Take a case I recently had with a NOR based system. I had
    Andrew> a Linux kernel bug I had to trace down. So that I had
    Andrew> human readable kernel oops information, I rebuilt the
    Andrew> kernel to include debug symbols. The resulting kernel was
    Andrew> too big to fit in the space allocated to it. So I used
    Andrew> redboot fis to zap both the root filesystem and the space
    Andrew> holding the kernel. I recreated the kernel partition a bit
    Andrew> bigger and made the root filesystem a bit smaller. I then
    Andrew> installed the new kernel and the root filesystem. I then
    Andrew> had a booting system with oops with symbols, not hex
    Andrew> addresses.

    Andrew> At no point did I need to edit the HAL, rebuild and
    Andrew> install a new redboot. Why should NAND be different? Why
    Andrew> cannot this partition information be configured by
    Andrew> redboot? Why must it be platform specific?

I am not a NAND expert, but I think the answer is that NAND is
fundamentally different from NOR: it is unreliable.

With NOR flash we can store partition table info (i.e. FIS) in a
well-known location. RedBoot can write to and read from that location
and it is pretty much guaranteed to work. Hence the only detail that
needs to be encoded in the RedBoot executable is that location - which
is configurable although nearly all systems will use the default.
Everything else can be determined and changed at run-time.

With NAND flash we do not have the guarantee from the hardware that a
well-known location will always work. In fact Sod's law will ensure
that that location will be one of the first to have errors.

Some systems will have reliable persistent storage in addition to the
NAND, e.g. an additional bank of NOR flash. On such systems we could
store the NAND partition table in the NOR flash, using similar code
to FIS. But we cannot assume that all systems will be like that, and I
do not like the idea of having two different partitioning schemes
depending on the hardware capabilities.

Or, we could try to implement a robust layer on top of the basic NAND
layer. If you want to store N pages reliably, reserve (N+f(N)+k)
pages in the NAND flash, i.e. N pages plus some number of spares.
When one of the pages starts failing, start using one of the spare
pages. Assume that updates will be infrequent
so that you do not need to worry about wear-levelling, just bad
blocks. This would give us a way of storing the partition info in the
NAND flash with a high degree of reliability. Unfortunately it means a
fairly complicated extra layer which will not be needed by e.g.
NAND-aware flash filesystems - those expect to handle the bad pages
etc. themselves.
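
As a rough illustration only, the spare-page idea might be sketched in
C along the following lines. Every name here is hypothetical, the
sizes are arbitrary, and the sketch ignores the hard part: the mapping
itself would also have to be stored somewhere persistent.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical sizes: N data pages plus a handful of spares. */
#define ROBUST_DATA_PAGES  4
#define ROBUST_SPARE_PAGES 3
#define ROBUST_TOTAL_PAGES (ROBUST_DATA_PAGES + ROBUST_SPARE_PAGES)

typedef struct {
    int  map[ROBUST_DATA_PAGES];   /* logical page -> physical page */
    bool bad[ROBUST_TOTAL_PAGES];  /* physical pages known to be bad */
    bool used[ROBUST_TOTAL_PAGES]; /* physical pages handed out so far */
} robust_region;

/* Start with an identity mapping; all spare pages are unused. */
void robust_init(robust_region *r)
{
    for (int p = 0; p < ROBUST_TOTAL_PAGES; p++) {
        r->bad[p]  = false;
        r->used[p] = (p < ROBUST_DATA_PAGES);
    }
    for (int i = 0; i < ROBUST_DATA_PAGES; i++) {
        r->map[i] = i;
    }
}

/*
 * Call when the physical page behind `logical` starts failing.
 * Marks that page bad and switches to the first unused good page.
 * Returns the new physical page, or -1 if no spares are left.
 */
int robust_remap(robust_region *r, int logical)
{
    assert(logical >= 0 && logical < ROBUST_DATA_PAGES);
    r->bad[r->map[logical]] = true;
    for (int p = 0; p < ROBUST_TOTAL_PAGES; p++) {
        if (!r->used[p] && !r->bad[p]) {
            r->used[p] = true;
            r->map[logical] = p;
            return p;
        }
    }
    return -1;
}
```

Even this toy version needs its own bookkeeping, which hints at why a
production-quality layer would be fairly complicated.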

So, if you cannot store the partition info in the NAND chip itself
without bloat, and you may not have any other persistent storage in
the system, you are left with two choices: don't bother with partition
info at all, or embed the partition info in the code.

Ignoring partitioning info completely in the NAND layer, and instead
leaving it to higher-level code, is somewhat tempting. But it does not
actually solve anything. Consider a chip with a built-in bootloader
which can read in a secondary bootloader from NAND flash. That
built-in bootloader will impose certain restrictions on the NAND
flash, e.g. it will reserve n pages at the start of the flash. The
partitioning code has to be aware of that. But you really do not want
higher-level partitioning code in both YAFFS and UFFS having to cope
with hardware-specific details like that. Doing it in a single NAND
layer with close ties to the platform HAL seems better.
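
To make that concrete, here is a minimal sketch of the kind of check a
single NAND layer could apply once, so that filesystems do not have
to. The structure, the function name and the idea of passing the
reserved page count as a parameter are all assumptions for
illustration, not part of any existing eCos API. The table is assumed
to be sorted by ascending start page.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical partition descriptor, measured in pages. */
typedef struct {
    unsigned first_page;
    unsigned num_pages;
} nand_partition;

/*
 * Returns true if every partition lies beyond the pages reserved by
 * the built-in bootloader, fits on the device, and does not overlap
 * the partition before it.
 */
bool nand_partitions_valid(const nand_partition *parts, size_t count,
                           unsigned reserved_pages, unsigned total_pages)
{
    assert(parts != NULL || count == 0);
    unsigned next_free = reserved_pages;
    for (size_t i = 0; i < count; i++) {
        if (parts[i].num_pages == 0) {
            return false;           /* empty partition */
        }
        if (parts[i].first_page < next_free) {
            return false;           /* hits reserved area or previous entry */
        }
        if (parts[i].first_page >= total_pages ||
            parts[i].num_pages > total_pages - parts[i].first_page) {
            return false;           /* runs off the end of the chip */
        }
        next_free = parts[i].first_page + parts[i].num_pages;
    }
    return true;
}
```

The platform HAL would supply reserved_pages, and YAFFS or UFFS would
only ever see partitions that have already passed this check.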

So, what you are left with is embedding the partition info in the
code. We really do not want to completely hardwire it for a given
system, so using configury is the next best thing. It is not as
flexible as FIS for NOR flash, but it will be adequate for most users.
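
A sketch of what "embedded but configurable" could look like on the C
side is below. The EXAMPLE_ macro names and the default values are
made up; in a real build the values would come from the generated
configuration headers rather than from #ifndef fallbacks like these.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical configuration values, normally set by configury. */
#ifndef EXAMPLE_NAND_PART0_BASE
# define EXAMPLE_NAND_PART0_BASE 8
#endif
#ifndef EXAMPLE_NAND_PART0_SIZE
# define EXAMPLE_NAND_PART0_SIZE 64
#endif
#ifndef EXAMPLE_NAND_PART1_BASE
# define EXAMPLE_NAND_PART1_BASE 72
#endif
#ifndef EXAMPLE_NAND_PART1_SIZE
# define EXAMPLE_NAND_PART1_SIZE 952
#endif

typedef struct {
    unsigned first_page;
    unsigned num_pages;
} example_nand_partition;

/* The table is fixed at build time; nothing is read from the NAND. */
static const example_nand_partition example_nand_partitions[] = {
    { EXAMPLE_NAND_PART0_BASE, EXAMPLE_NAND_PART0_SIZE },
    { EXAMPLE_NAND_PART1_BASE, EXAMPLE_NAND_PART1_SIZE },
};

#define EXAMPLE_NAND_PARTITION_COUNT \
    (sizeof(example_nand_partitions) / sizeof(example_nand_partitions[0]))

/* Returns the partition at `index`, or NULL if there is none. */
const example_nand_partition *example_nand_get_partition(size_t index)
{
    return index < EXAMPLE_NAND_PARTITION_COUNT
        ? &example_nand_partitions[index]
        : NULL;
}

/* First page past the end of partition `index`, which must exist. */
unsigned example_nand_partition_end(size_t index)
{
    assert(index < EXAMPLE_NAND_PARTITION_COUNT);
    return example_nand_partitions[index].first_page
         + example_nand_partitions[index].num_pages;
}
```

Changing the layout then means changing configuration options and
rebuilding, not editing HAL sources by hand.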

Now, there are ways in which we should be able to avoid duplicating
lots of CDL for most platforms. We can have a default partitioning
scheme in the generic NAND code, guarded by an active_if property
that tests some interface. If a
platform can use that default scheme it can just implement the
interface, possibly in conjunction with some default settings and
requires properties. If a platform really needs to do something
special it can ignore the generic support and define its own
partitioning CDL, taking into account whatever weird constraints the
hardware has. It may not be possible to figure out exactly how to do
that just yet, and it may need to wait until NAND support is working
on a few more platforms.

Bart