Our supplier told us that they had good experiences with the 20736S, whereas they haven't been selling many 20737S so they have less experience there. Also, from what I've heard, there won't be a 20737E. Since we would like to improve range a bit, and because of the buggy parts, we decided to go with the 20736S for now.
Unfortunately I don't have time for more research.
Thanks for that. Have you had a successful production run with the 20736S, and seen the problem disappear?
Note that the 20736S and 20737S are virtually identical parts with the exception of the Polar SSL stack that can only be enabled on the 20737S.
With that said, this user did find that the platform files (20736S vs. 20737S) made a difference in their particular design: rtc_sample bricks our BCM20736S-based board
Many others interchange these files and never have an issue. Not sure why it worked here.
However, this would not be the issue bobcat_1857246 is seeing, which is related to what appears to be a 2% failure rate.
For this issue, have you tried forcing the part to never sleep in order to see if the issue still occurs? If this works, then maybe increase the Xtal warmup period to see if that helps?
Unless there is a way to reproduce the failure use case, it may be tough to track down the root cause.
We were well aware of the XTAL warm-up scenario on the original 20732S devices, though I had actually thought it had gone away with the newer 20736S and 20737S devices. I have recently been working with 2 customers whose boards were initially failing, and updating the XTAL warm-up time in the mandatory.cgs files in the SDK to 5000 resolved both situations.
ENTRY "PMU Crystal Warm up Time"
"Crystal warm up time" = 5000
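If you would rather script this change than edit the file by hand, here is a minimal sketch in Python. The helper name set_xtal_warmup is hypothetical, it assumes the setting appears in the .cgs text exactly as in the two lines quoted above, and the starting value 1000 in the usage lines is only a placeholder, not the SDK default.

```python
import re

# Matches the setting line as quoted above: "Crystal warm up time" = <number>
# (assumption: this is the only place the quoted key is followed by '=').
WARMUP_RE = re.compile(r'("Crystal warm up time"\s*=\s*)(\d+)')


def set_xtal_warmup(cgs_text: str, value: int) -> str:
    """Return cgs_text with the crystal warm-up value replaced by `value`."""
    if value <= 0:
        raise ValueError("warm-up value must be a positive integer")
    # A function replacement avoids ambiguity between group refs and digits.
    new_text, count = WARMUP_RE.subn(
        lambda m: m.group(1) + str(value), cgs_text
    )
    if count == 0:
        raise ValueError("no 'Crystal warm up time' setting found")
    return new_text


# Usage on an in-memory sample (1000 is a placeholder starting value).
sample = 'ENTRY "PMU Crystal Warm up Time"\n"Crystal warm up time" = 1000\n'
patched = set_xtal_warmup(sample, 5000)
print(patched)
```

Keep a backup of the original mandatory.cgs before writing the patched text back, so the change is easy to revert.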
The symptom we saw in both cases was that the boards would initially program fine and the BCM firmware would run very briefly after being programmed, but then would fail. On some of the boards I could see HCI Debug Trace output upon a RESET, and execution would stop after what appeared to be the 'xxx_create()' function, certainly before the first 1 second timeout would occur, as we would never see that. On the other project I put in a simple LED toggle to validate that it had been programmed and was executing; I would see the LED blink once and then execution would stop. Both showed very similar symptoms: very brief execution, then a stop.
Another HINT: In my experience, if you have devices that need the XTAL warm-up time updated to 5000 and they have already been programmed once, you may need to put them in 'Recovery' mode to be able to re-program them with this setting.
To put the devices in 'Recovery' mode: (Forces BCM2073xS to boot from ROM instead of EEPROM or SFLASH)
1) Take the SDA line and connect a wire jumper to it.
2) Tie the SDA line to your VDD source.
3) Hit RESET on the BCM2073x/BCM2073xS.
4) In the SDK, copy your current 'make target' and paste it back in at the top. It will indicate that this is a 'Copy' of your other make target.
5) Replace the 'download' tag in the make target with 'recover' and use this make target to update the BCM20736S.
Here is an example... You don't have to add the UART=COMx designator, but I like to do this myself as I sometimes have multiple USB-UARTs in use.
6) If successful you will see a 'Successful Recovery' message printed to the 'Console' tab. Sometimes you do have to repeat this a couple of times so don't give up if the first try does not work.
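Steps 4 and 5 boil down to a small string change on the make target. Here is a rough sketch in Python of that change. The function name to_recover_target and the target name my_app-MY_PLATFORM are hypothetical placeholders, and putting UART=COMx at the end of the target is an assumption based on the description above, not something checked against the SDK.

```python
from typing import Optional


def to_recover_target(download_target: str, uart: Optional[str] = None) -> str:
    """Swap the 'download' tag for 'recover', optionally adding UART=<port>."""
    words = download_target.split()
    if "download" not in words:
        raise ValueError("make target has no 'download' tag")
    # Step 5: replace the 'download' tag with 'recover'.
    words = ["recover" if w == "download" else w for w in words]
    # Optional: pin the serial port when several USB-UARTs are attached.
    if uart and not any(w.startswith("UART=") for w in words):
        words.append(f"UART={uart}")
    return " ".join(words)


# 'my_app-MY_PLATFORM' is a made-up target name used only for illustration.
print(to_recover_target("my_app-MY_PLATFORM download", uart="COM5"))
```

The original 'download' target stays untouched, so you can go back to normal programming once the recovery has succeeded.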
We have ten new devices with the 20736S, but I haven't gotten around to checking them yet. I do have another project on my hands as well.
I was hell bent on getting a spectrum analyzer to maybe sniff out what sort of signal source would behave strangely on the chip. I have never tried this sort of forensics before. Maybe one could derive a better test that way.
Note that we found out today from the developers that some of our tier 1 customers have actually run into isolated instances where the crystal warmup period needs to be increased to 6000, or slightly more (even 7000). If more than 7000 is needed, then there is something else that is causing the issue.
Thanks for your suggestions. I just tried the 7000 crystal warmup, after using a recovery mode to make sure it really takes, and it did not help the problem. Behind the scenes, I'm trying to get our rep to take one of these modules back to Broadcom, to see what they can find. I have a couple of modules that exhibit this problem, even with stock "hello_sensor" code in them.
Unfortunately, I'm not sure who inside our company would be set up to debug custom customer designs.
I have good news: SDK 2.2.1, which was just released, solves my problem. Both Hello_Sensor and my own code would stop advertising and connecting after 30 seconds of operation, on 4 out of 150 modules we built. Now, with SDK 2.2.1, both Hello_Sensor and my own code continue to run for extended periods on all modules.
We are now testing for stability over longer periods, and with heavy operation.
Very good! Thanks for your efforts. It boggles my mind what kind of software issue would exacerbate parameter deviations in hardware.
EDIT: After some testing the new SDK appears to have fixed the issue.
You could actually desolder the chip and test it in whatever test fixture you have. I was fairly close to sending one of our boards back to you too.