I am having problems with the OTA process in the field.
On two occasions during the OTA process, the BCM4343W was rebooted by the watchdog timer. When we then try to boot the module again, it does not work.
The messages that appear are shown below:
Starting WICED v5.0.0
Started ThreadX v5.6
Initialising NetX_Duo v5.7_sp2
Creating Packet pools
WWD SDIO interface initializing with US/0
Fatal error: download_resource cannot load with invalid size
Could not initialize bus
Could not initialize wifi platform
After that, I can't recover the module.
Clue: "The issue that's been identified here of two boards getting "bricked" appears to be related to inadequate servicing of the watchdog timer during an OTA update."
Any suggestions to solve this issue?
Thanks in advance!
I think you are using the console app with the mfgtest FW.
Can you confirm whether the mfgtest FW binary file is present at the resource location below:
Also, please describe what you do in the OTA process.
What is the application that you are trying to upgrade to? (Kindly provide its size and a few details about it.) Based on the terminal log, it seems that the OTA upgrade happened and then the firmware download (refer to wwd_bus_common.c) failed because the resource size came out as <0, which can happen when the image in the external flash is corrupted. If you have physical access to the bricked devices, you can follow this Unbricking of CYW943907 to recover your boards.
Regarding the watchdog issue (which you suspect is the cause): in 43xxx_Wi-Fi/WICED/platform/MCU/STM32F4xx/STM32F4xx.mk, the maximum watchdog timeout is 22 seconds. You can either modify this value or add the global define WICED_DISABLE_WATCHDOG to your makefile during the OTA process, to find out whether the bricking is caused by watchdog servicing.
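If the watchdog does turn out to be the cause, one common approach is to service it between flash writes so that no single step of the update runs longer than the timeout. The sketch below only illustrates the idea and is not taken from the WICED sources: `ota_write_with_watchdog`, `write_fn`, `kick_fn` and the `demo_*` stubs are hypothetical names. On the target, the kick callback would presumably wrap the platform's watchdog service call (assumed here to be `wiced_watchdog_kick()`), and the write callback would wrap the real flash write.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical callback types: on a real target these would wrap the
 * platform watchdog service call and the flash write routine. */
typedef void (*kick_fn)(void);
typedef int (*write_fn)(const uint8_t *data, size_t len); /* 0 = success */

/* Write an image in bounded chunks, servicing the watchdog before each
 * chunk so that a long update never exceeds the watchdog timeout. */
int ota_write_with_watchdog(const uint8_t *image, size_t size, size_t chunk,
                            write_fn write, kick_fn kick)
{
    assert(image != NULL && write != NULL && kick != NULL && chunk > 0);

    size_t offset = 0;
    while (offset < size) {
        size_t n = size - offset;
        if (n > chunk) {
            n = chunk;
        }
        kick();                              /* service before a slow flash op */
        if (write(image + offset, n) != 0) {
            return -1;                       /* stop at the first failed write */
        }
        offset += n;
    }
    kick();                                  /* leave a full timeout for what follows */
    return 0;
}

/* Demo stubs (assumptions, for host-side testing only). */
int demo_kicks;
int demo_writes;

void demo_kick(void) { demo_kicks++; }

int demo_write(const uint8_t *data, size_t len)
{
    (void)data;
    (void)len;
    demo_writes++;
    return 0;
}
```

The chunk size should be chosen so that the worst-case erase plus write time of one chunk stays well below the configured watchdog timeout.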
The customer shared their OTA file to help us identify the issue.
On our side, we worked with ST, and together we identified that the customer's OTA file has, for some reason, overwritten the area of the ST MCU that is reserved for our boot sector on the Avnet module.
We understand that this would be avoided if the boot area on the STM32 MCU had the Read Out Protection feature enabled. Would we be able to provide a custom file to the customer with this feature added?
Besides that, the customer has been facing another error after OTA.
The customer's description of the error follows:
There is an error in the data written by wiced_framework_app_write_chunk.
We noticed that in a considerable number of units this write fails without any flag (silently).
The data is not written correctly to DCT_APP0_INDEX.
So the customer introduced an extra check at the end of the image receiving process.
The new_fw_has_valid_checksum function (line 485 of wiced_ota_server.c) reads back the saved area, recalculates the CRC16, and compares it with the received one. When the CRC doesn't match, the update process is canceled.
Note that in many units the process runs smoothly: checking the written data returns success and the new firmware can be started. But in a considerable number of units (4 of the 11 units we are testing), the verification does not match and the process cannot proceed. If we remove the new_fw_has_valid_checksum check and let the process continue, the resulting image is not functional.
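For reference, a read-back check of this kind can be sketched as follows. This is not the customer's new_fw_has_valid_checksum implementation. The CRC variant (CRC-16/CCITT-FALSE: polynomial 0x1021, initial value 0xFFFF) is an assumption, since the thread does not say which CRC16 is used, and `read_fn`, `image_crc_matches` and `ram_read` are hypothetical names. On the target, the read callback would read from the stored application area in flash.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* CRC-16/CCITT-FALSE update step (poly 0x1021, MSB first).
 * Assumed variant: the thread does not name the CRC16 in use. */
uint16_t crc16_update(uint16_t crc, const uint8_t *data, size_t len)
{
    assert(data != NULL || len == 0);
    for (size_t i = 0; i < len; i++) {
        crc ^= (uint16_t)((uint16_t)data[i] << 8);
        for (int bit = 0; bit < 8; bit++) {
            crc = (crc & 0x8000u) ? (uint16_t)((crc << 1) ^ 0x1021u)
                                  : (uint16_t)(crc << 1);
        }
    }
    return crc;
}

/* Hypothetical reader: copies len bytes at offset of the stored image
 * into buf and returns 0 on success. */
typedef int (*read_fn)(void *ctx, uint32_t offset, uint8_t *buf, size_t len);

/* Re-read the stored image in small blocks, recompute the CRC and compare
 * it with the CRC received alongside the image. Returns 1 on match. */
int image_crc_matches(read_fn read, void *ctx, uint32_t image_size,
                      uint16_t expected_crc)
{
    uint8_t buf[256];
    uint16_t crc = 0xFFFFu;          /* initial value of the assumed variant */
    uint32_t offset = 0;

    assert(read != NULL);
    while (offset < image_size) {
        size_t n = image_size - offset;
        if (n > sizeof buf) {
            n = sizeof buf;
        }
        if (read(ctx, offset, buf, n) != 0) {
            return 0;                /* a read error counts as a mismatch */
        }
        crc = crc16_update(crc, buf, n);
        offset += (uint32_t)n;
    }
    return crc == expected_crc;
}

/* Demo reader backed by a RAM buffer (assumption, for host-side testing). */
int ram_read(void *ctx, uint32_t offset, uint8_t *buf, size_t len)
{
    memcpy(buf, (const uint8_t *)ctx + offset, len);
    return 0;
}
```

A check like this detects a bad write after the fact. It does not explain why the write failed, so the return value of every chunk write should still be checked.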
The customer has a release scheduled for next week.
Units that show the write problem can be recovered by programming them over JTAG.
Would it be advisable to remove this functionality (OTA firmware update) from the Novus product?
We have two problems with this functionality and have not yet found a solution that guarantees they will not happen in the field.
What steps would you suggest we take?
Thanks in advance,
Can you please make sure the transfer is complete by checking the return value of wiced_framework_app_write_chunk()?
Adding a checksum at the end of the updated image does ensure that the complete image is received. (This case is not properly handled in WICED, as the return value of wiced_framework_app_write_chunk() is not checked anywhere.)
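A minimal sketch of what checking the return value could look like is shown below. It is self-contained and does not call the WICED API directly: `chunk_write_fn` is a hypothetical stand-in for wiced_framework_app_write_chunk() (assumed to return 0 on success, as WICED_SUCCESS does), and `ota_progress_t`, `ota_store_chunk` and `ota_transfer_complete` are illustrative names.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the chunk write call; 0 means success. */
typedef int (*chunk_write_fn)(void *app, const uint8_t *data, uint32_t size);

typedef struct {
    uint32_t bytes_written;  /* bytes confirmed written so far */
    int      last_error;     /* first non-zero result seen, 0 if none */
} ota_progress_t;

/* Store one received chunk and record the outcome instead of ignoring it. */
int ota_store_chunk(ota_progress_t *p, chunk_write_fn write_chunk, void *app,
                    const uint8_t *data, uint32_t size)
{
    assert(p != NULL && write_chunk != NULL && data != NULL);

    int rc = write_chunk(app, data, size);
    if (rc != 0) {
        if (p->last_error == 0) {
            p->last_error = rc;   /* remember the first failure */
        }
        return rc;                /* caller should abort the update */
    }
    p->bytes_written += size;
    return 0;
}

/* The transfer counts as complete only if no write failed and the
 * number of bytes written equals the announced image size. */
int ota_transfer_complete(const ota_progress_t *p, uint32_t expected_size)
{
    assert(p != NULL);
    return p->last_error == 0 && p->bytes_written == expected_size;
}

/* Demo stubs (assumptions, for host-side testing only). */
int demo_write_ok(void *app, const uint8_t *data, uint32_t size)
{
    (void)app; (void)data; (void)size;
    return 0;
}

int demo_write_fail(void *app, const uint8_t *data, uint32_t size)
{
    (void)app; (void)data; (void)size;
    return 1;
}
```

With this in place, the server can refuse to mark the new image as bootable unless ota_transfer_complete() reports success and the checksum over the stored image also matches.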