public inbox for ecos-devel@sourceware.org
* contributing filesystem and a failsafe update meachanism for FIS from within ecos applications
@ 2005-09-23 12:09 Neundorf, Alexander
  2005-09-23 13:02 ` Jon Ringle
  0 siblings, 1 reply; 6+ messages in thread
From: Neundorf, Alexander @ 2005-09-23 12:09 UTC (permalink / raw)
  To: Andrew Lunn; +Cc: ecos-devel

[-- Attachment #1: Type: text/plain, Size: 7403 bytes --]

Hi,

it's not even one year ago, and already I have the next version of my patch available ;-)

The attached patch implements a read-only filesystem for FIS, and three extra utility functions for manipulating it safely from ecos applications.

We need to be able to perform safe updates of the firmware, safe with regard to power loss at any point in time. Since redboot comes with FIS, we'd like to use FIS.
To update the firmware, a new firmware image has to be placed on the flash and the fis directory has to be updated. When the fis directory is updated, it is first erased and then rewritten with the new contents.
If the power goes down directly after the directory has been erased, redboot can no longer start the firmware image, since it can't read the directory.

In order to enable failsafe operation of redboot and fis under such circumstances, a backup of the fis directory has to be kept until the new directory has been written successfully.
Here comes my proposed strategy:
Currently the fis directory occupies one block of the flash. For safe operation it needs a second redundant block. Both blocks contain the fis directory, but only one is valid (and current).
Redboot needs a way to determine which block contains the valid information.
For this, and to stay compatible with existing flash, I suggest using the first entry of the fis directory table as a valid marker, which can be used to decide which of the two blocks is valid.
It looks like this:

#ifdef CYGOPT_REDBOOT_REDUNDANT_FIS
#define CYG_REDBOOT_RFIS_VALID_MAGIC_LENGTH 10
#define CYG_REDBOOT_RFIS_VALID_MAGIC ".FisValid"  //exactly 10 bytes, including the terminating NUL

#define CYG_REDBOOT_RFIS_VALID       (0xa5)
#define CYG_REDBOOT_RFIS_IN_PROGRESS (0xfd)
#define CYG_REDBOOT_RFIS_EMPTY       (0xff)

struct fis_valid_info
{
   char magic_name[CYG_REDBOOT_RFIS_VALID_MAGIC_LENGTH];
   unsigned char valid_flag[2]; //this should be safe for all alignment issues
   unsigned long version_count;
};
#endif // CYGOPT_REDBOOT_REDUNDANT_FIS


The name is the special name ".FisValid", followed by the actual valid_flag, which signals the validity of this FIS table. This way the FIS table stays compatible with the other algorithms in redboot.
To find the valid FIS table, the name of the first entry is compared against ".FisValid". If it matches, valid_flag is checked. The table is only valid if valid_flag == 0xa5a5, i.e. both bytes are 0xa5. If this is true for both FIS tables, the current and the redundant one, their version_count values are compared, and the table with the bigger version_count becomes the valid FIS table.
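
The selection rule described above could be sketched roughly as follows. This is only an illustration and not the code from the patch: the helper names rfis_is_valid() and rfis_select() are made up, the macro names are shortened, and uint32_t stands in for unsigned long so that the sketch behaves the same on a 64-bit host. Wrap-around of version_count is not handled.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define RFIS_VALID_MAGIC_LENGTH 10
#define RFIS_VALID_MAGIC        ".FisValid"  /* 9 chars + NUL = 10 bytes */
#define RFIS_VALID              0xa5

/* Mirrors struct fis_valid_info from the mail (uint32_t is an assumption). */
struct fis_valid_info {
    char          magic_name[RFIS_VALID_MAGIC_LENGTH];
    unsigned char valid_flag[2];
    uint32_t      version_count;
};

/* A table counts as valid only if the magic name matches and
 * both flag bytes are 0xa5. */
static int rfis_is_valid(const struct fis_valid_info *info)
{
    return memcmp(info->magic_name, RFIS_VALID_MAGIC,
                  RFIS_VALID_MAGIC_LENGTH) == 0
        && info->valid_flag[0] == RFIS_VALID
        && info->valid_flag[1] == RFIS_VALID;
}

/* Hypothetical helper: returns 0 or 1 for the table to use,
 * or -1 if neither table is valid. */
static int rfis_select(const struct fis_valid_info *a,
                       const struct fis_valid_info *b)
{
    int a_ok = rfis_is_valid(a);
    int b_ok = rfis_is_valid(b);

    if (a_ok && b_ok)  /* both valid: the bigger version_count wins */
        return (b->version_count > a->version_count) ? 1 : 0;
    if (a_ok)
        return 0;
    if (b_ok)
        return 1;
    return -1;
}
```

A table whose flag is still 0xfdfd (in progress) or 0xffff (erased) is simply treated as not valid, so no separate handling of those states is needed when selecting.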

When performing a safe update, the algorithm must do the following:
(after the * follows what happens when the power goes down at this point in time)

1. modify the fis directory (in RAM) so that it reflects the desired changes, set the valid_flag to RFIS_IN_PROGRESS and set version_count=version_count+1;
*nothing has changed yet, so redboot will work as before

2. erase the flash where the currently invalid fis directory is located
*the valid_flag of the fis directory which will become the new valid directory is 0xffff, and the valid flag of the currently still active directory is still 0xa5a5, and the images haven't been touched yet, so still everything ok for redboot

3. write the modified fis directory in this erased flash block. In redboot/flash.c: fis_start_update_directory()
*as above, but the valid_flag of the directory which is intended to become valid is now 0xfdfd. The images still haven't been touched, so everything is ok.

4. modify the flash image (erase, program)
*now the image has been modified. If you erase the only runnable firmware image on the flash you are of course lost, so just avoid this. In all other cases, there is still a working fis directory and a working firmware image on the flash. The old current fis directory is still valid, and the currently running firmware image hasn't been touched. By checking the CRCs of the images later you can detect which images are broken.

5. after the image is written, set the valid_flag of the fis directory which will become active to 0xa5a5. In order to do this, the flash block doesn't have to be erased, since the transition from 0xfdfd to 0xa5a5 only sets some bits to 0. When this is done, the image has been written correctly and the new fis directory has the right magic_name, the right valid_flag and its version_count is higher than the version_count of the old fis directory. In redboot/flash.c:  fis_update_directory()
*if the power goes down while writing the 2 bytes of the valid_flag, there are two cases: either the valid_flag has already reached 0xa5a5, and then everything is ok, or it has not, in which case valid_flag != 0xa5a5 and the directory is not considered valid.
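
The five steps can be played through on a small in-RAM model to see that a power loss after any step still leaves a usable directory. This is a simplified sketch and not the code from the patch: struct dir_block only holds the marker fields, the image write in step 4 is not modelled, and all names (flash_model, active_dir(), safe_update()) are made up for the illustration.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

enum { FLAG_VALID = 0xa5, FLAG_IN_PROGRESS = 0xfd, FLAG_EMPTY = 0xff };

/* Simplified stand-in for one fis directory block (marker entry only). */
struct dir_block {
    unsigned char valid_flag[2];
    uint32_t      version_count;
};

struct flash_model {
    struct dir_block dir[2];   /* the two redundant directory blocks */
};

static int block_valid(const struct dir_block *d)
{
    return d->valid_flag[0] == FLAG_VALID && d->valid_flag[1] == FLAG_VALID;
}

/* Index of the directory redboot would use, or -1 if none is valid. */
static int active_dir(const struct flash_model *f)
{
    int v0 = block_valid(&f->dir[0]);
    int v1 = block_valid(&f->dir[1]);

    if (v0 && v1)
        return f->dir[1].version_count > f->dir[0].version_count ? 1 : 0;
    if (v0)
        return 0;
    if (v1)
        return 1;
    return -1;
}

/* Runs the update; power is "lost" after steps_done steps (1..5). */
static void safe_update(struct flash_model *f, int steps_done)
{
    int cur = active_dir(f);
    int spare;
    struct dir_block ram;

    if (cur < 0)
        return;                 /* nothing to update from */
    spare = 1 - cur;

    /* Step 1: prepare the new directory in RAM only. */
    ram = f->dir[cur];
    ram.valid_flag[0] = ram.valid_flag[1] = FLAG_IN_PROGRESS;
    ram.version_count++;
    if (steps_done < 2) return;

    /* Step 2: erase the currently unused directory block (all 0xff). */
    memset(&f->dir[spare], FLAG_EMPTY, sizeof f->dir[spare]);
    if (steps_done < 3) return;

    /* Step 3: write the new directory, still marked "in progress". */
    f->dir[spare] = ram;
    if (steps_done < 4) return;

    /* Step 4: erase and program the image (not modelled here). */
    if (steps_done < 5) return;

    /* Step 5: 0xfd -> 0xa5 only clears bits, so no erase is needed. */
    f->dir[spare].valid_flag[0] &= FLAG_VALID;
    f->dir[spare].valid_flag[1] &= FLAG_VALID;
}
```

In this model the old directory stays the active one for an interruption after steps 1 to 4, and the new directory only takes over once step 5 has completed.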

The attached patch implements support for this strategy in redboot. It basically reads the first entry of both fis blocks, checks them and sets one to be the valid one. The fis manipulation functions in redboot have been modified to support this style of operation. This "safe" FIS can be enabled via the option CYGOPT_REDBOOT_REDUNDANT_FIS.

To make the update functionality available to ecos applications, a new virtual vector (VV) call had to be added, since flash_fis_op() can't list the existing images; it can only return information for an image if you already know its name. The new VV call has the following subfunctions:

* CYGNUM_CALL_IF_FLASH_FIS_GET_VERSION: for checking the compatibility between redboot VV interface and the application

* CYGNUM_CALL_IF_FLASH_FIS_INIT: read the FIS table and find the valid one

* CYGNUM_CALL_IF_FLASH_FIS_GET_ENTRY_COUNT: get the maximum number of entries the FIS table can have

* CYGNUM_CALL_IF_FLASH_FIS_GET_ENTRY: return the information for one FIS table entry by its index. This uses a binary struct, which isn't identical to struct fis_image_desc, but contains most of its information.

* CYGNUM_CALL_IF_FLASH_FIS_MODIFY_ENTRY: puts the parameters given for an image in the specified entry of the FIS table (in RAM). If you have done this for the image you want to modify, call FIS_START_UPDATE, then update the image and finally call FIS_FINISH_UPDATE

* CYGNUM_CALL_IF_FLASH_FIS_START_UPDATE: start updating the FIS table. Has to be called before writing the image on the flash. Without redundant FIS this does nothing. With redundant FIS it does what is described in step 3) above.

* CYGNUM_CALL_IF_FLASH_FIS_FINISH_UPDATE: finish updating the FIS table. Has to be called after writing the image to the flash successfully. Without redundant FIS it simply writes the new FIS table, with redundant FIS it just marks the already written new table as valid.
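
The intended call order (MODIFY_ENTRY, then START_UPDATE, then write the image, then FINISH_UPDATE) can be illustrated with a small mock. Only the subfunction names are taken from the list above. The numeric values, the mock dispatcher mock_fis_op(), update_image() and the write_image_* stand-ins are hypothetical, and the order checking is an assumption made for this sketch, not the behaviour of the real VV call.

```c
#include <assert.h>

/* Names as in the list above; the numeric values are hypothetical. */
enum fis_op {
    CYGNUM_CALL_IF_FLASH_FIS_GET_VERSION,
    CYGNUM_CALL_IF_FLASH_FIS_INIT,
    CYGNUM_CALL_IF_FLASH_FIS_GET_ENTRY_COUNT,
    CYGNUM_CALL_IF_FLASH_FIS_GET_ENTRY,
    CYGNUM_CALL_IF_FLASH_FIS_MODIFY_ENTRY,
    CYGNUM_CALL_IF_FLASH_FIS_START_UPDATE,
    CYGNUM_CALL_IF_FLASH_FIS_FINISH_UPDATE
};

enum update_state { UPDATE_IDLE, UPDATE_MODIFIED, UPDATE_STARTED };

struct mock_fis {
    enum update_state state;
    int finished_updates;      /* counts completed updates */
};

/* Mock dispatcher: returns 0 on success, -1 if called out of order. */
static int mock_fis_op(struct mock_fis *m, enum fis_op op)
{
    switch (op) {
    case CYGNUM_CALL_IF_FLASH_FIS_MODIFY_ENTRY:
        if (m->state == UPDATE_STARTED)
            return -1;
        m->state = UPDATE_MODIFIED;   /* table changed in RAM only */
        return 0;
    case CYGNUM_CALL_IF_FLASH_FIS_START_UPDATE:
        if (m->state != UPDATE_MODIFIED)
            return -1;
        m->state = UPDATE_STARTED;    /* new table written, not yet valid */
        return 0;
    case CYGNUM_CALL_IF_FLASH_FIS_FINISH_UPDATE:
        if (m->state != UPDATE_STARTED)
            return -1;
        m->state = UPDATE_IDLE;       /* new table marked valid */
        m->finished_updates++;
        return 0;
    default:
        return 0;                     /* queries are allowed at any time */
    }
}

/* Stand-ins for the application code that programs the image. */
static int write_image_ok(void)   { return 0; }
static int write_image_fail(void) { return -1; }

/* Hypothetical application-side update sequence. */
static int update_image(struct mock_fis *m, int (*write_image)(void))
{
    if (mock_fis_op(m, CYGNUM_CALL_IF_FLASH_FIS_MODIFY_ENTRY) != 0)
        return -1;
    if (mock_fis_op(m, CYGNUM_CALL_IF_FLASH_FIS_START_UPDATE) != 0)
        return -1;
    if (write_image() != 0)
        return -1;   /* FINISH_UPDATE is skipped, the old table stays valid */
    return mock_fis_op(m, CYGNUM_CALL_IF_FLASH_FIS_FINISH_UPDATE);
}
```

The point of the sketch is that FINISH_UPDATE is only reached after the image has been written successfully, so a failed or interrupted image write never makes the new table valid.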

For the user there are three functions available, fis_get_entry(), fis_remove_image() and fis_create_image(), which call these VVs appropriately. fis_create_image() currently takes a pointer to the whole data buffer and writes it as an image to the flash. This might not work for devices which don't have that much RAM. But since this is implemented in the application, it should not be too hard for somebody who needs this to extend the function accordingly.

We have been using this update mechanism for approx. one year now and it has never failed. So I think it would be a good contribution to eCos.

Additionally a read-only file system for FIS is implemented in the attached patch.

So what do you think ?

Bye
Alex

[-- Attachment #2: ecos.fisfs.patch.gz --]
[-- Type: application/x-gzip, Size: 18239 bytes --]

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: contributing filesystem and a failsafe update meachanism for FIS from within ecos applications
  2005-09-23 12:09 contributing filesystem and a failsafe update meachanism for FIS from within ecos applications Neundorf, Alexander
@ 2005-09-23 13:02 ` Jon Ringle
  2005-09-23 13:33   ` Gary Thomas
  0 siblings, 1 reply; 6+ messages in thread
From: Jon Ringle @ 2005-09-23 13:02 UTC (permalink / raw)
  To: ecos-devel

On Friday 23 September 2005 08:09 am, Neundorf, Alexander wrote:
> 5. after the image is written, set the valid_flag of the fis directory
> which will become active to 0xa5a5. In order to do this, the flash block
> doesn't have to be erased, since the transition from 0xfdfd to 0xa5a5 only
> sets some bits to 0.

I didn't know this property of flash. Is this a universal property of NOR 
flash?

Jon


* Re: contributing filesystem and a failsafe update meachanism for FIS from within ecos applications
  2005-09-23 13:02 ` Jon Ringle
@ 2005-09-23 13:33   ` Gary Thomas
  2005-09-23 18:06     ` Tarmo Pikaro
  0 siblings, 1 reply; 6+ messages in thread
From: Gary Thomas @ 2005-09-23 13:33 UTC (permalink / raw)
  To: ml.ecos; +Cc: eCos development

On Fri, 2005-09-23 at 09:01 -0400, Jon Ringle wrote:
> On Friday 23 September 2005 08:09 am, Neundorf, Alexander wrote:
> > 5. after the image is written, set the valid_flag of the fis directory
> > which will become active to 0xa5a5. In order to do this, the flash block
> > doesn't have to be erased, since the transition from 0xfdfd to 0xa5a5 only
> > sets some bits to 0.
> 
> I didn't know this property of flash. Is this a universal property of NOR 
> flash?

Yes. Erase operations reset all the bits to one.  Programming can
only change a one to a zero.  This can be done on a bit by bit 
basis, thus you can update a block by changing some bits to zeroes
without having to erase it.
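
As a small illustration of this property, programming can be modelled as a bitwise AND of the old cell contents with the new value. This is a sketch only: the helper names nor_program() and reachable_without_erase() are made up, and real parts may have further restrictions on reprogramming a word that has already been written, so the datasheet has the final say.

```c
#include <assert.h>
#include <stdint.h>

/* Models NOR programming: a bit can only go from 1 to 0,
 * so the result is the AND of the old and the new value. */
static uint8_t nor_program(uint8_t cell, uint8_t value)
{
    return cell & value;
}

/* True if `to` can be reached from `from` without an erase,
 * i.e. if no bit would have to change from 0 back to 1. */
static int reachable_without_erase(uint8_t from, uint8_t to)
{
    return (from & to) == to;
}
```

With these helpers, 0xff to 0xfd and 0xfd to 0xa5 are both reachable without an erase, while 0xa5 back to 0xfd is not.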

-- 
------------------------------------------------------------
Gary Thomas                 |  Consulting for the
MLB Associates              |    Embedded world
------------------------------------------------------------


* Re: contributing filesystem and a failsafe update meachanism for FIS from within ecos applications
  2005-09-23 13:33   ` Gary Thomas
@ 2005-09-23 18:06     ` Tarmo Pikaro
  0 siblings, 0 replies; 6+ messages in thread
From: Tarmo Pikaro @ 2005-09-23 18:06 UTC (permalink / raw)
  To: Gary Thomas, ml.ecos; +Cc: eCos development

Hi !

Hmmm... I'm kind of involved with firmware updating,
and I'm curious to hear that someone is taking a closer
look at the fail-safeness of updates. Can you send me all
the related documentation and source code?


> Yes. Erase operations reset all the bits to one.  Programming can
> only change a one to a zero.  This can be done on a bit by bit
> basis, thus you can update a block by changing some bits to zeroes
> without having to erase it.


* Re: contributing filesystem and a failsafe update meachanism for FIS from within ecos applications
  2005-10-06  6:45 AW: " Neundorf, Alexander
@ 2005-10-06  7:22 ` Andrew Lunn
  0 siblings, 0 replies; 6+ messages in thread
From: Andrew Lunn @ 2005-10-06  7:22 UTC (permalink / raw)
  To: Neundorf, Alexander; +Cc: ecos-devel

On Thu, Oct 06, 2005 at 08:44:58AM +0200, Neundorf, Alexander wrote:
> 
> 
> > Von: ecos-devel-owner@ecos.sourceware.org
> > 
> > Hi,
> > 
> > it's not even one year ago, and already I have the next 
> > version of my patch available ;-)
> > 
> > The attached patch implements a read-only filesystem for FIS, 
> > and three extra utility functions for manipulating it safely 
> > from ecos applications.
> 
> Any comments ?

The same ones as last year.

You should also take a look at locking. You have some locking between
the FS and the functions, but it has race conditions. The locking the
other way does not exist at all.

You also incorrectly use st_mode, but that is minor.

        Andrew


* RE: contributing filesystem and a failsafe update meachanism for FIS from within ecos applications
@ 2005-09-23 13:11 Daly, Jeffrey
  0 siblings, 0 replies; 6+ messages in thread
From: Daly, Jeffrey @ 2005-09-23 13:11 UTC (permalink / raw)
  To: ml.ecos, ecos-devel

Yep.

>-----Original Message-----
>From: ecos-devel-owner@ecos.sourceware.org
>[mailto:ecos-devel-owner@ecos.sourceware.org] On Behalf Of Jon Ringle
>Sent: Friday, September 23, 2005 9:01 AM
>To: ecos-devel@ecos.sourceware.org
>Subject: Re: contributing filesystem and a failsafe update meachanism
>for FIS from within ecos applications
>
>On Friday 23 September 2005 08:09 am, Neundorf, Alexander wrote:
>> 5. after the image is written, set the valid_flag of the fis directory
>> which will become active to 0xa5a5. In order to do this, the flash block
>> doesn't have to be erased, since the transition from 0xfdfd to 0xa5a5 only
>> sets some bits to 0.
>
>I didn't know this property of flash. Is this a universal property of NOR
>flash?
>
>Jon


Thread overview: 6+ messages
2005-09-23 12:09 contributing filesystem and a failsafe update meachanism for FIS from within ecos applications Neundorf, Alexander
2005-09-23 13:02 ` Jon Ringle
2005-09-23 13:33   ` Gary Thomas
2005-09-23 18:06     ` Tarmo Pikaro
2005-09-23 13:11 Daly, Jeffrey
2005-10-06  6:45 AW: " Neundorf, Alexander
2005-10-06  7:22 ` Andrew Lunn
