From mboxrd@z Thu Jan 1 00:00:00 1970
From: David Woodhouse
To: ecos-discuss@sources.redhat.com
Subject: [ECOS] Re: Fw: [ECOS] Re: Simple flash filesystem? (fwd)
Date: Wed, 07 Feb 2001 04:14:00 -0000
Message-id: <7895.981548077@redhat.com>
X-SW-Source: 2001-02/msg00095.html

Eep. Paying attention where I send it this time...

--
dwmw2

------- Forwarded Message

From: David Woodhouse
To: Kristian Otnes
Cc: "Paul Beskeen", ecos-maintainers@redhat.com
Subject: Re: Fw: [ECOS] Re: Simple flash filesystem?
Date: Wed, 07 Feb 2001 11:22:50 +0000

kristian.otnes@tevero.no said:
> the reason for emulating a disk in a flash based filesystem is
> probably twofold:
> - It fits in with the normal disk approach usage
> - It breaks the larger blocks (typically 64KB or 128KB) into
>   virtual smaller blocks, so that other software is not
>   bothered by the problem of handling the large flash blocks
>   efficiently. In other words, it is a relatively simple
>   way of managing some of the harder parts of flash usage.

Both of those are only an issue for people dealing with legacy
operating systems who are stuck with the existing block-based
filesystem concept. I can understand doing this under DOS where you
provide an INT13h handler for your device to make it pretend to be a
normal disc drive, and you don't want to get any more involved with
the O/S than you have to. Under real operating systems these days
though, it's not really an issue.

When you emulate a 'normal' block device on flash, you basically end
up with a kind of pseudo-filesystem to keep track of where the blocks
are, etc. Obviously you need that to be a journalling pseudo-filesystem
of some kind, to prevent corruption.

On top of that emulated block device, you then need to put a 'normal'
journalling filesystem. You've got two layers of filesystem and two
layers of journalling. It's not wonderfully efficient.

I spent a long time dreaming of a filesystem which worked directly on
flash chips without this problem.
Eventually, the guys at Axis wrote it - JFFS runs directly on the
flash chips.

It's a log-structured filesystem. You just write nodes sequentially to
the flash. Each node contains the current metadata for the file you're
writing, including stuff like name and parent inode number so the
directory tree can be built, and usually some data for a portion of
that file. There's no wasted space, because each node comes immediately
after the previous node. (Well, we do align them to 4 bytes.)

The filesystem keeps a map of which bits of each file can be found at
what location on the flash, and when you read from the file, it just
copies the data out of the right node on the flash for you.

The interesting bit is when you get to the end of the flash chip(s) -
you have to start again at the beginning. Generally, some of the nodes
you wrote out right at the beginning have been obsoleted by later
writes to the same offset in the same file. So taking each erase block
one at a time from the beginning again, you copy the nodes that are
still valid into the space you've got left, and then erase that block.
Generally, you've made yourself some more space by doing that.

JFFS has some problems - it uses quite a lot of RAM because it keeps a
complete 'map' for each file in-core at all times, and it will
garbage-collect erase blocks strictly in order even if some of them
don't actually _have_ any obsoleted nodes so it's just moving megabytes
of data from one location on the flash to another. We get _perfect_
wear _levelling_, but it's hardly optimal.

I'm currently working on a re-implementation of JFFS, imaginatively
called JFFS2. It extends the excellent ideas of Axis' original and
fixes these problems, along with adding support for hard links and
compression.

It's turning out to re-use almost no code from the original (GPL'd)
version. So the prospects for an eCos port look fairly good.
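
To make the log-structured scheme above a bit more concrete, here is a
rough Python sketch. It is only a toy model under stated assumptions,
not JFFS or JFFS2 code: the names ToyLog, Node, gc_oldest and
rebuild_index are made up, erase blocks hold a fixed number of nodes
instead of bytes, every write replaces one whole (inode, offset) range,
and the per-node version counter is an assumed way of telling a newer
node from an older one.

```python
from collections import deque
from dataclasses import dataclass


@dataclass
class Node:
    ino: int       # inode the node belongs to
    offset: int    # file offset its data covers
    data: bytes
    version: int   # assumption: a newer node for the same range has a higher version


class ToyLog:
    """Toy log-structured store. Hypothetical; not the JFFS on-flash format."""

    def __init__(self, num_blocks=4, nodes_per_block=4):
        assert num_blocks >= 3, "need a head, a victim and one spare block"
        self.blocks = [[] for _ in range(num_blocks)]  # erase blocks
        self.nodes_per_block = nodes_per_block
        self.order = deque([0])                  # written blocks, oldest first
        self.free = deque(range(1, num_blocks))  # erased blocks
        self.index = {}                          # in-core map: (ino, offset) -> newest Node
        self.version = 0

    def _head_full(self):
        return len(self.blocks[self.order[-1]]) == self.nodes_per_block

    def _append(self, node):
        # Nodes go immediately after the previous node; move on when a block fills.
        if self._head_full():
            if not self.free:
                raise RuntimeError("flash full")
            self.order.append(self.free.popleft())
        self.blocks[self.order[-1]].append(node)
        self.index[(node.ino, node.offset)] = node

    def gc_oldest(self):
        """Copy still-valid nodes out of the oldest block, then erase it.

        Blocks are taken strictly in order, so a block with no obsolete
        nodes is copied in full. Returns the number of nodes copied.
        """
        if len(self.order) < 2:
            return 0
        victim = self.order.popleft()
        survivors = [n for n in self.blocks[victim]
                     if self.index.get((n.ino, n.offset)) is n]
        for n in survivors:
            self._append(n)
        self.blocks[victim] = []   # erase
        self.free.append(victim)
        return len(survivors)

    def write(self, ino, offset, data):
        # Keep one erased block in reserve so GC always has room to copy into.
        attempts = 0
        while self._head_full() and len(self.free) <= 1:
            if attempts == len(self.blocks):
                raise RuntimeError("flash full")
            self.gc_oldest()
            attempts += 1
        self.version += 1
        self._append(Node(ino, offset, data, self.version))

    def read(self, ino, offset):
        return self.index[(ino, offset)].data

    def rebuild_index(self):
        """Rebuild the map by scanning every node, as a mount-time scan might."""
        index = {}
        for block in self.blocks:
            for n in block:
                key = (n.ino, n.offset)
                if key not in index or n.version > index[key].version:
                  index[key] = n
        return index
```

Rewriting the same offset over and over leaves the oldest block full of
obsolete nodes, so collecting it frees a whole block for nothing. Filling
the log with nodes that are all still valid and then calling gc_oldest()
shows the cost described above: every node in the block gets copied and
no space is gained.
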

Any volunteers to work on an eCos version once the Linux code has
stabilised a little and at least _compiles_ would be welcome :)

- --
dwmw2

------- End of Forwarded Message