BCM20737 will not boot

Anonymous

We have BCM20737 units that fail to boot on reset. The micro reads a small amount of data from EEPROM but does not go on to read out the entire application.

Analyzing the static sections (SS1 and SS2) of these units, I observe that the first byte of SS1 is set to 0x00 while the first byte of SS2 is set to 0xFF. In working units, at least one of the static sections has its first byte set to 0x01. Could this prevent the micro from reading in the active partition from EEPROM? The units are running the 2.2.1 SDK.

Thanks for the help.

1 Solution
JacobT_81
Employee

Analyzing the static sections (SS1 and SS2) of these units, I observe that the first byte of SS1 is set to 0x00 while the first byte of SS2 is set to 0xFF. In working units, at least one of the static sections has its first byte set to 0x01. Could this prevent the micro from reading in the active partition from EEPROM? The units are running the 2.2.1 SDK.

The first 11 bytes of the SS are checksummed on boot. If the checksum fails, the board will step through certain locations in memory until it finds a valid SS to use. If it doesn't find one, it will boot only from ROM.

The first three bytes of the SS you are using should always be 0x01 0x00 0x08. Whichever SS your working boards are using should contain this sequence. As you noted, your bricked boards do not contain this sequence. Corruption of this area is likely the cause of your boot failure.
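For anyone triaging EEPROM dumps offline, the prefix check and the "try each location in turn" behaviour described above can be sketched roughly as follows. This is only an illustration: the offsets in `CANDIDATE_OFFSETS` are placeholders, not documented SS locations, the function names are made up for this example, and the boot ROM's actual checksum over the first 11 bytes is not reproduced because its algorithm is not given in this thread. A matching prefix is therefore necessary but not sufficient.

```python
# Prefix a valid static section should start with, per the reply above.
EXPECTED_PREFIX = bytes([0x01, 0x00, 0x08])

# ASSUMPTION: placeholder offsets for SS1 and SS2 inside the dump.
# Substitute the real locations for your part and EEPROM layout.
CANDIDATE_OFFSETS = (0x000, 0x100)


def has_expected_prefix(image: bytes, offset: int) -> bool:
    """True if the bytes at `offset` match the expected three-byte prefix."""
    return image[offset:offset + len(EXPECTED_PREFIX)] == EXPECTED_PREFIX


def first_plausible_ss(image: bytes, offsets=CANDIDATE_OFFSETS):
    """Return the first offset whose prefix looks valid, or None.

    Mirrors only the "check each location in turn" idea; it does NOT
    compute the boot ROM's checksum.
    """
    for offset in offsets:
        if has_expected_prefix(image, offset):
            return offset
    return None


# Bricked pattern from this thread: SS1 starts with 0x00, SS2 with 0xFF.
# ASSUMPTION: the remaining SS1 bytes are intact (0x00 0x08).
bricked = bytearray(b"\xff" * 0x200)
bricked[0x000:0x003] = bytes([0x00, 0x00, 0x08])

# Working pattern: at least one SS starts with 0x01 0x00 0x08.
working = bytearray(bricked)
working[0x100:0x103] = EXPECTED_PREFIX

print(first_plausible_ss(bytes(bricked)))  # None
print(first_plausible_ss(bytes(working)))  # 256
```

Run against a real dump, a result of `None` would match the symptom reported here, where neither static section carries the expected sequence.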

Can you confirm that you're performing OTA updates to these boards? It sounds like a failure is occurring in that process. 

Since reproducing this error is difficult, for the time being I recommend that you move to SDK 2.2.3. This latest SDK addresses issues of NVRAM corruption like you're seeing.

To utilize these fixes, please use 2.2.3 and add to your makefile:

     APP_PATCHES_AND_LIBS += config_nvram_fixes.a

Jacob

JacobT_81
Employee

Are you able to perform a recovery on the boards?

Are these custom boards you're using?

Jacob

It does sound like the device is in Recovery mode (more on boot-up can be found here: More WICED Sense Recovery Questions).

They are using SDK 2.2.1. I don't have the release notes for SDK 2.2.2 or 2.2.3, but wasn't there a bug fix in one of the later SDKs to prevent EEPROMs from getting trashed?

Are you able to force the failure to occur, or is it random? Can you try your application with SDK 2.2.3?

Anonymous

Hello all,

Sorry for the late reply. These are production units on a custom board. Performing recovery on them is not an option. I doubt the EEPROMs are getting trashed, as a diff of the EEPROM contents of a working unit and a bricked unit shows a change in only a single bit (as described above).
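A byte-level diff like the one described here can be narrowed down to individual bits by XOR-ing the two dumps. The sketch below is a hypothetical helper, not an SDK tool, and it assumes both dumps are already loaded as equal-length `bytes` objects (for example via `open(path, "rb").read()`). The sample data reuses the 0x01 versus 0x00 first byte reported in this thread; the trailing 0xAA byte is filler.

```python
def bit_differences(good: bytes, bad: bytes):
    """List (offset, good_byte, bad_byte, xor_mask) for every differing byte."""
    if len(good) != len(bad):
        raise ValueError("dumps must be the same length")
    return [
        (offset, g, b, g ^ b)
        for offset, (g, b) in enumerate(zip(good, bad))
        if g != b
    ]


def flipped_bit_count(diffs) -> int:
    """Total number of bits that differ across all reported bytes."""
    return sum(bin(mask).count("1") for _, _, _, mask in diffs)


# Illustrative data only: first byte 0x01 (working) versus 0x00 (bricked).
good = bytes([0x01, 0x00, 0x08, 0xAA])
bad = bytes([0x00, 0x00, 0x08, 0xAA])

diffs = bit_differences(good, bad)
for offset, g, b, mask in diffs:
    print(f"0x{offset:04X}: 0x{g:02X} -> 0x{b:02X} (mask 0x{mask:02X})")
print(flipped_bit_count(diffs))  # 1
```

A single set bit in the mask at offset 0 of a static section is consistent with the one-bit change described above, and it is enough to break the expected 0x01 0x00 0x08 sequence.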

We are not able to force the failure to occur; it has only happened to a few units in the field. Even if I try our application with 2.2.3, I cannot test for the fix since I cannot reproduce the issue.

Can you point me to documentation of the chip's low-level boot sequence, specifically with regard to reading the static sections?

Thanks. 

Anonymous

Can I get more detail on the step "Boot ROM checks EEPROM for a valid configuration" from the second link you posted? Specifically, what is the significance of the data in the static sections?

Anonymous

Hi jakewtorres

Thank you for the response. Yes, we are performing OTA upgrades on these boards. Is it possible to backport these fixes to the SDK we are currently using (2.2.1)?

Thanks.

In theory you should be fine pulling the .a file you need from 2.2.3 and adding it to 2.2.1, but I can't officially recommend doing this since the code was not verified against 2.2.1.

Jacob

karthik_1888801

Posting a correction to Jacob's reply: the optional patch config_nvram_fixes.a in SDK 2.2.3 does not affect OTA upgrade, and there should be no issues during OTA. It is rather an update to the NV storage algorithm, that is, the reads and writes that result from the bleprofile_*NVRam() API functions.

This patch is required if power may be lost, or if the battery voltage may dip, during an NV erase or write. If such a scenario exists and this patch is not included, there is a risk of losing all NV items: all pairing info and any other info the app stored using bleprofile_*NVRam() may be lost.

You can find the affected API functions bleprofile_ReadNVRAM(), bleprofile_WriteNVRAM(), bleprofile_DeleteNVRAM() in <SDK>/include/bleprofile.h.

Karthikram,

Were you able to find the root cause of the corruption and/or solve the issue? If you could update us on your findings that would be great!

Thanks,

Jaeyoung