Recover Flash Devices when Power Failure or Reset Happens During a Program or Erase Operation - KBA218817

Version: **

 

Question:

How do I recover Cypress Flash devices when a sudden power down or flash reset happens during a program or erase operation? 

 

Answer:

The flash data state at any given location can be trusted only if every flash operation that has run on that location, since the most recent block or sector erase, is known to have started and run to completion while under continuous power. Here, a location is a collection of bits ranging from one bit to the maximum number of bits within the scope of a given flash operation. 

If any of the flash operations in this sequence has been interrupted for a location, the flash data state in the location is indeterminate and cannot be trusted for further reading or programming. In other words, any stable or unstable data state is possible, ranging from all zeros to all ones and anything in between – including an apparently correct image of the intended data. It is the system’s responsibility to detect interrupted operations and to perform correct recovery actions. 

  1. Recovery for interrupted erase: erase the block or sector again. 
  2. Recovery for interrupted program:
    1. If the source data exists, program the data again in a new location.
    2. Mark the location subject to interrupted programming as “untrustworthy” and do not use that location for reading or programming until the block or sector containing this location can be erased again.

Some sources recommend this recovery for interrupted programming: 

  • If the source data exists, program the data again to the same location.

However, this method cannot be proven to always yield a trustworthy data state. Recovery #2 above is therefore the safest recovery for an interrupted program operation.
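As an illustration only, the two recovery actions can be sketched in C against a small in-memory model of one sector. Everything here is a hypothetical stand-in, not a Cypress API: the `sector_model` type, the slot layout, and the function names are assumptions, and a real system would call its own flash driver and keep the "untrustworthy" state in non-volatile storage.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define SLOTS_PER_SECTOR 4u   /* assumed layout: fixed-size slots per sector */
#define SLOT_SIZE        8u

typedef enum { SLOT_FREE, SLOT_VALID, SLOT_UNTRUSTED } slot_state;

/* Hypothetical RAM model of one flash sector and its bookkeeping. */
typedef struct {
    uint8_t    data[SLOTS_PER_SECTOR][SLOT_SIZE];
    slot_state state[SLOTS_PER_SECTOR];
} sector_model;

/* Models a sector erase that runs to completion: all bits return to 1. */
void sector_erase(sector_model *s) {
    memset(s->data, 0xFF, sizeof s->data);
    for (size_t i = 0; i < SLOTS_PER_SECTOR; i++) {
        s->state[i] = SLOT_FREE;
    }
}

/* Recovery 1: an interrupted erase is recovered by erasing again. */
void recover_interrupted_erase(sector_model *s) {
    sector_erase(s);
}

/* Recovery 2: quarantine the interrupted location, then program the source
 * data (if it still exists) into a NEW location, never the same one.
 * Returns the new slot index, or -1 if nothing was reprogrammed. */
int recover_interrupted_program(sector_model *s, size_t bad_slot,
                                const uint8_t *src) {
    assert(bad_slot < SLOTS_PER_SECTOR);
    s->state[bad_slot] = SLOT_UNTRUSTED;   /* unusable until the next erase */
    if (src == NULL) {
        return -1;                         /* source data no longer exists */
    }
    for (size_t i = 0; i < SLOTS_PER_SECTOR; i++) {
        if (s->state[i] == SLOT_FREE) {
            memcpy(s->data[i], src, SLOT_SIZE);
            s->state[i] = SLOT_VALID;
            return (int)i;
        }
    }
    return -1;                             /* no free location in this sector */
}
```

In this model, a quarantined slot becomes usable again only after `sector_erase` runs on the whole sector, which mirrors the rule that the location stays off limits until its block or sector is erased.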

The following approach is the best way for the system to identify locations where erase or program recovery actions must be taken: 

  1. Program a non-volatile location with a value that signifies that the flash operation will start.
  2. Start the flash operation and observe that it runs to completion under continuous power. 
  3. Program a non-volatile location with a value that signifies that the flash operation has completed.

Upon restart, the system can scan for any starting marks that do not have a corresponding ending mark. Then the system can take appropriate recovery actions as stated above.
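The start/end mark scheme above might look like the following minimal C sketch. It is a simulation under stated assumptions: the journal lives in a RAM array standing in for non-volatile storage, a mark is a single byte that goes from 0xFF (erased) to 0x00 (programmed), and the names `journaled_op`, `scan_interrupted`, and the `op` callback are hypothetical. A callback returning `false` models power being lost before the operation completes.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define NUM_SECTORS 4u
#define MARK_ERASED 0xFFu   /* mark byte never programmed */
#define MARK_SET    0x00u   /* mark byte fully programmed */

/* Hypothetical journal entry: one start/end mark pair per sector. In a real
 * system these bytes would live in non-volatile storage. */
typedef struct {
    uint8_t start_mark;
    uint8_t end_mark;
} op_journal_entry;

static op_journal_entry journal[NUM_SECTORS];

void journal_init(void) {
    for (size_t i = 0; i < NUM_SECTORS; i++) {
        journal[i].start_mark = MARK_ERASED;
        journal[i].end_mark   = MARK_ERASED;
    }
}

/* Simulation helpers: an operation that completes, and one cut short. */
bool op_ok(size_t sector)         { (void)sector; return true;  }
bool op_power_loss(size_t sector) { (void)sector; return false; }

/* Brackets one flash operation with a start mark and an end mark. */
bool journaled_op(size_t sector, bool (*op)(size_t sector)) {
    assert(sector < NUM_SECTORS);
    journal[sector].start_mark = MARK_SET;  /* step 1: operation will start */
    if (!op(sector)) {
        return false;                       /* step 2 interrupted: no end mark */
    }
    journal[sector].end_mark = MARK_SET;    /* step 3: operation completed */
    return true;
}

/* Restart scan: collects sectors that have a start mark but no end mark.
 * Any non-erased start mark counts as "started", which is a conservative
 * choice in case the mark write itself was interrupted. */
size_t scan_interrupted(size_t *out, size_t cap) {
    size_t n = 0;
    for (size_t i = 0; i < NUM_SECTORS; i++) {
        bool started  = journal[i].start_mark != MARK_ERASED;
        bool finished = journal[i].end_mark == MARK_SET;
        if (started && !finished && n < cap) {
            out[n++] = i;
        }
    }
    return n;
}
```

After a restart, each sector index returned by `scan_interrupted` would be handed to the appropriate erase or program recovery action.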

Commercial and open source flash file systems use methods like this to eliminate untrustworthy locations from processing – flash file system solutions from Blunk Microsystems, Datalight, Kyoto Software Research, and QNX are known to follow the Cypress best practice. Linux, when used with Cypress patches, also conforms to the Cypress best practice. 

Some Cypress flash devices have this logic built into the erase algorithm (US 9,378,829). The first such device is the S25FS-S family of serial NOR flash. Subsequent 65-nm and 45-nm serial NOR and parallel NOR devices from Cypress also have this feature – look in the datasheet for the “Evaluate Erase Status” command. This command checks for the start/finish marks associated with each flash sector so the system can quickly identify any sectors that have been subjected to interrupted erase.