This Subject summarizes the voluminous “Reset Recovery Considerations” Posts.
I am writing this as a benefit to others, and also for myself since the recap effort refreshes the points in my mind.
The premise is to find ways to deal with resets that occur. Resets should also be prevented where possible, but the cost of prevention can be high, particularly with resets caused by environmental EMI, some of which may originate in equipment unrelated to the embedded system.
A Watchdog Reset can be just as disruptive as the other resets; development testing should explore how close to the WDT threshold you are, with remedial efforts applied when the “cushion” is too low. How to recover from a WDT event in a production system is a user consideration: does it represent a solid failure or just a single occurrence? CyResetStatus is useful for reporting WDT events.
While analyzing resets, a technique was developed to find where in the code a reset was occurring. A byproduct was the ability to find those areas of code that were heavy consumers of MCU capacity. The technique is discussed below.
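One way to put a number on the “cushion” during development testing is to time the interval between successive watchdog feeds and keep the worst case. The following is a minimal sketch under assumptions: the names (wdt_margin_t, wdt_margin_feed, and so on) are hypothetical, the tick source is assumed to be a free-running 32-bit counter, and the WDT timeout in ticks is assumed to be known from the project settings.

```c
#include <stdint.h>

/* Hypothetical helper: tracks the longest interval seen between watchdog feeds. */
typedef struct {
    uint32_t timeout_ticks;   /* WDT period in timer ticks (assumed known) */
    uint32_t last_feed;       /* tick count at the previous feed */
    uint32_t worst_interval;  /* longest feed-to-feed interval observed */
} wdt_margin_t;

void wdt_margin_init(wdt_margin_t *m, uint32_t timeout_ticks, uint32_t now)
{
    m->timeout_ticks = timeout_ticks;
    m->last_feed = now;
    m->worst_interval = 0u;
}

/* Call immediately before each real watchdog feed. */
void wdt_margin_feed(wdt_margin_t *m, uint32_t now)
{
    uint32_t interval = now - m->last_feed;  /* unsigned math tolerates counter wrap */
    if (interval > m->worst_interval) {
        m->worst_interval = interval;
    }
    m->last_feed = now;
}

/* Remaining cushion in ticks (0 means at or past the threshold). */
uint32_t wdt_margin_ticks(const wdt_margin_t *m)
{
    if (m->worst_interval >= m->timeout_ticks) {
        return 0u;
    }
    return m->timeout_ticks - m->worst_interval;
}
```

Reporting wdt_margin_ticks() at the end of a test run shows how close the worst case came to the threshold.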
Resets fall into two supersets. Some existing documents contain misstatements. From Cypress Tech Support:
- “Software reset and watchdog reset fall under soft reset sources.”
- “XRES, IPOR, PRES, LVI and HVI sources fall under hard reset sources.”
Only Soft resets are reported via CyResetStatus. The WDT implementation is counterintuitive in that it is a soft reset.
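A minimal sketch of classifying a reset from the status byte follows. The mask values and the names RESET_WD_MASK, RESET_SW_MASK and classify_reset are assumptions for illustration; the real bit definitions should be taken from the generated CyLib.h of the project rather than from this example.

```c
#include <stdint.h>

/* Assumed bit positions; verify against the CY_RESET_* masks in the generated CyLib.h. */
#define RESET_WD_MASK 0x08u  /* watchdog reset (assumed value) */
#define RESET_SW_MASK 0x20u  /* software reset (assumed value) */

typedef enum {
    RESET_CAUSE_HARD_OR_UNKNOWN = 0,  /* nothing reported: treat as a hard reset */
    RESET_CAUSE_WATCHDOG,
    RESET_CAUSE_SOFTWARE
} reset_cause_t;

/* Classify a copy of the reset status byte; watchdog takes priority if both bits are set. */
reset_cause_t classify_reset(uint8_t status)
{
    if (status & RESET_WD_MASK) {
        return RESET_CAUSE_WATCHDOG;
    }
    if (status & RESET_SW_MASK) {
        return RESET_CAUSE_SOFTWARE;
    }
    return RESET_CAUSE_HARD_OR_UNKNOWN;
}
```

On the target, the argument would be CyResetStatus, read once early in main().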
LVI, as a hard reset source, can be specified to produce either a reset or an interrupt (which has user code insertion spots). The choice should be obvious.
Reset recovery should be tailored to whether the reset represents an unrecoverable event (power supply failure) or a transient condition (EMI, power failure without backup power).
Reset recovery has the objective of restoring the system to the state it was in when the reset occurred, while reducing the time of the outage. Critical systems may need some variety of backups. While a recovery is underway, some means is required to prevent interim actions from taking effect until the recovery is complete.
I use the external 32K crystal as a clock for accurately timing components and for producing a DIY RTC. The startup time for the 32K oscillator is a consideration. I am exploring using an external RTC Module with a small coin battery backup. The module provides a configurable square wave output for use within PSoC3 (it may need synchronizing). Having a battery backup makes the RTC always available, and the oscillator startup delay is eliminated.
While it is possible to control components without saving their control parameters, it should be discouraged. The current values of these parameters are the key to quickly returning the system to the state it was in when the reset occurred. The use of EEPROM is not a viable answer due to duty cycle and to its large but limited lifetime. The use of SRAM can be the answer, but the default action of most resets is to Clear SRAM. There is a .cydwr System option to not clear SRAM on startup (and resets), which appears to be the desired choice.
A method of determining if a POR Reset occurred is required. A POR requires establishing the system; otherwise a recovery is the method of choice. But POR is not reported so other detection is required. Examining a configuration SRAM entry could provide the differentiation. I employ a Global, “RUN”, which is used to switch between the backup and PSoC3: if it is set, then PSoC3 has been in control when the reset occurred. This applies to all hard resets; thus recovery is the same.
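A minimal sketch of this kind of SRAM-based check follows, assuming SRAM is not cleared and the marker is not initialized. The names, the signature value, and the use of a signature plus its complement (instead of a single flag) are illustrative assumptions; the extra pattern only reduces the chance that random power-up SRAM contents look like a valid marker.

```c
#include <stdint.h>

#define BOOT_SIGNATURE 0xA55A3CC3u  /* arbitrary illustrative pattern */

/* On the target this would live in SRAM that is neither cleared nor initialized. */
typedef struct {
    uint32_t signature;
    uint32_t signature_inv;
} boot_marker_t;

/* Returns 1 if the marker survived (recover), 0 if not (treat as POR and establish the system). */
int boot_marker_is_warm(const boot_marker_t *m)
{
    return (m->signature == BOOT_SIGNATURE) &&
           (m->signature_inv == (uint32_t)~BOOT_SIGNATURE);
}

/* Call once the system has been established and is in control. */
void boot_marker_arm(boot_marker_t *m)
{
    m->signature = BOOT_SIGNATURE;
    m->signature_inv = (uint32_t)~BOOT_SIGNATURE;
}

/* Call when control is deliberately handed away, so the next start is treated as cold. */
void boot_marker_clear(boot_marker_t *m)
{
    m->signature = 0u;
    m->signature_inv = 0u;
}
```

At startup the code would test boot_marker_is_warm() first, then either run the full establishment path or the recovery path, and arm the marker afterwards.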
Most resets invoke INIT.A51, whose purpose is to restore initialized SRAM variables to their initial values. Initializing component control parameters stored in SRAM would make them unusable during recovery. Avoid initialization for recovery-sensitive SRAM variables.
Any dependent SRAM variables would not need to be recalculated because not clearing SRAM would leave them intact. They should not be initialized as a result of typical coding practices (else INIT.A51 will wipe them out).
Most resets return components to their DWR settings. There is no option to skip this reconfiguration. Therefore a recovery task is to return the component settings to the values saved in SRAM, which is one reason why not clearing SRAM is desirable. The component reconfiguration for soft resets is a superset of that for hard resets; the last step for soft resets is to re-establish the components to their last known state.
Knowing where in code a reset occurred may assist in the recovery or remedial action. It may require an additional set of steps. If the “where” included more than one possible action that might cause a reset (e.g., turning motors on or off), then selectively inserting a digital output pin instruction, used as a DSO trigger, allows finding which of the actions is causing the reset. GlobalSignalRef might assist this by providing a signal rather than an instruction, but interpreting it based on time is more of an art.
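The save-and-restore idea can be sketched as a write-through shadow copy: every change to a component goes through a wrapper that records the value in SRAM first, and the recovery path replays the recorded values. Everything here is hypothetical: the two simulated registers and their setters stand in for real component API calls, and on the target the shadow structure would be left uninitialized in SRAM that is not cleared.

```c
#include <stdint.h>

/* Simulated component registers; on the target these would be component API calls. */
uint16_t sim_pwm_compare;
uint8_t  sim_motor_enable;

void hw_pwm_set_compare(uint16_t v) { sim_pwm_compare = v; }    /* stand-in setter */
void hw_motor_set_enable(uint8_t on) { sim_motor_enable = on; } /* stand-in setter */

/* Shadow copy of the control parameters (uninitialized, non-cleared SRAM on target). */
typedef struct {
    uint16_t pwm_compare;
    uint8_t  motor_enable;
} shadow_t;

shadow_t shadow;

/* Write-through wrappers: record the intent first, then touch the hardware. */
void set_pwm_compare(uint16_t v)
{
    shadow.pwm_compare = v;
    hw_pwm_set_compare(v);
}

void set_motor_enable(uint8_t on)
{
    shadow.motor_enable = on;
    hw_motor_set_enable(on);
}

/* Recovery step: push the last known values back into the components. */
void restore_components(void)
{
    hw_pwm_set_compare(shadow.pwm_compare);
    hw_motor_set_enable(shadow.motor_enable);
}

/* Test aid only: mimic components reverting to their design-time defaults. */
void simulate_component_reset(void)
{
    sim_pwm_compare = 0u;
    sim_motor_enable = 0u;
}
```

Updating the shadow before the hardware means that a reset landing between the two statements still restores the intended value.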
The technique alluded to above was initially developed with “where” as the goal, but has evolved to serve a twofold purpose. The second purpose is to statistically determine the heavy MCU users. I refer to the technique as CallName logging/sampling. It uses ENUM ordinal entries of the CallNames (with a suffix to make the compiler happy). Each call of interest is preceded by saving the “suffixed CallName” to a Global and followed by a post-call statement that restores the Global to the name of the caller. The Global thus contains the name of the called routine during its execution and is restored at its completion. If the calling routine is not of sufficient interest, the post call could restore the Global to a “not of interest” indicator (zero fills the bill). The Global is only available after a reset if SRAM is not cleared and the Global is not initialized. If you decide to retain Clear SRAM, Generated Code modifications would be required to retain the Global (a Case has been entered to request such mods, including Generated Code as a possible participant).
Statistical sampling of CallNames: by using DMA, it is possible to sample the CallName Global and send it to one of two destinations. An IDAC8 can be used with its output routed to an analog output pin; an analog scope monitoring the pin would show, via intensity, where most time is being spent. A 1 byte CallName Global output (limited to 256 values) should be discrete enough to see individually on the scope. If an “off board” communication link is available, then a UART (or other link) destination provides a data stream that a PC program can analyze to produce a histogram and a usage ranking. If the PC program were to also read and parse the CallName ENUM, the outputs could be more user-friendly. The DMA sampling frequency will need some empirical testing to give useful results. If the UART is the method of choice, I recommend “wading” through the original Posts to learn a simplified DMA UART methodology (it will be found in Page 3).
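A minimal sketch of CallName logging follows. The routine names, the _CN suffix, the Global's name and the LOGGED_CALL macro are all illustrative assumptions, and the macro as written only handles routines that take no arguments and return nothing.

```c
#include <stdint.h>

/* Ordinal entries for the calls of interest; the _CN suffix keeps them
   distinct from the routine names themselves. Zero means "not of interest". */
typedef enum {
    NotOfInterest_CN = 0,
    ReadSensors_CN,
    UpdateMotors_CN
} CallName;

/* The logged Global. On the target it must stay uninitialized and SRAM must
   not be cleared, otherwise its value is lost across a reset. */
volatile uint8_t CurrentCallName;

/* Save the callee's CallName, make the call, then restore the caller's. */
#define LOGGED_CALL(fn, caller_cn)                   \
    do {                                             \
        CurrentCallName = (uint8_t)fn##_CN;          \
        fn();                                        \
        CurrentCallName = (uint8_t)(caller_cn);      \
    } while (0)

/* Demo routines that record what the Global held while they ran. */
uint8_t SeenInReadSensors;
uint8_t SeenInUpdateMotors;

void ReadSensors(void)
{
    SeenInReadSensors = CurrentCallName;
}

void UpdateMotors(void)
{
    /* Nested call: restore to UpdateMotors_CN when ReadSensors returns. */
    LOGGED_CALL(ReadSensors, UpdateMotors_CN);
    SeenInUpdateMotors = CurrentCallName;
}
```

From a caller that is not tracked, the call would be written as LOGGED_CALL(UpdateMotors, NotOfInterest_CN); after a reset, the surviving value of CurrentCallName names the routine that was executing.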
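On the PC side, the histogram and usage ranking can be sketched as below, assuming the sampled stream has already been captured into a byte buffer (one CallName ordinal per sample). The type and function names are hypothetical, and reading the serial port or parsing the ENUM is left out.

```c
#include <stdint.h>
#include <stddef.h>
#include <stdlib.h>

typedef struct {
    uint8_t  call_name;  /* CallName ordinal */
    uint32_t count;      /* number of samples that held this ordinal */
} usage_t;

/* qsort comparator: higher counts first, ties broken by lower ordinal. */
int by_count_desc(const void *a, const void *b)
{
    const usage_t *ua = (const usage_t *)a;
    const usage_t *ub = (const usage_t *)b;
    if (ua->count != ub->count) {
        return (ua->count < ub->count) ? 1 : -1;
    }
    return (int)ua->call_name - (int)ub->call_name;
}

/* Builds a 256-bin histogram of the samples and sorts it into a ranking.
   Returns the number of ordinals that appeared at least once. */
size_t rank_call_names(const uint8_t *samples, size_t n, usage_t ranking[256])
{
    size_t i;
    size_t used = 0;

    for (i = 0; i < 256; ++i) {
        ranking[i].call_name = (uint8_t)i;
        ranking[i].count = 0u;
    }
    for (i = 0; i < n; ++i) {
        ranking[samples[i]].count++;
    }
    qsort(ranking, 256, sizeof ranking[0], by_count_desc);

    while (used < 256 && ranking[used].count > 0u) {
        used++;
    }
    return used;
}
```

The first entries of the ranking are the candidates for optimization; each count divided by the total number of samples approximates that call's share of MCU time.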
AN60630 shows ways to optimize PSoC3, an excellent reference. However, it is time and co$t intensive to apply the optimization to all portions of a project. The old 80/20 rule says to find the 20% of the code that consumes 80% of the capacity; concentrating on it reduces the development Dollar$ spent on optimization. CallName sampling can show statistically which of the calls are consuming the most capacity. Optimize the worst and work downward until the co$t vs. reward says stop.
Understanding resets has been a long road to travel, made easier by the welcome help of other Forum contributors; sometimes just asking a question can lead to analyzing an area that might not otherwise be considered.
Now I need to apply these guidelines retroactively to my project.