BLE Mesh is getting stuck in Central Mode

AnYo_3398046

Hi Everyone,

I have taken over an associate's project working with the BLE Mesh, which I found they built upon one of the Cypress examples. I believe THIS is the example project. I have constructed a larger mesh system, about 50 BLE devices, using this mesh. About every 12th message (sometimes more, sometimes less), I get one device that gets stuck in Central mode. Since the device is in Central, I cannot access it via Bluetooth as it is not advertising, basically rendering the device useless to me. The only method to get the device back up and working is by cutting the power and turning it back on.

To move forward with the diagnosis, I needed to prove to myself that I was in fact in Central mode, so I started using a timer to test this. I added UART lines inside main.c that print a line to my output for every 20 timer cycles in which ble_gap_state == BLE_CENTRAL. The variable "ble_gap_state" comes from the original example project and is always either BLE_CENTRAL or BLE_PERIPHERAL. As a note, I only send a UART message every 20 cycles so that I don't slow down the mesh devices that are working properly.
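For anyone wanting to reproduce the check, the throttling described above can be sketched as a small helper. This is only an illustration: the name central_debug_due and the choice to restart the count whenever the device leaves Central are assumptions, not code from the example project.

```c
#include <stdbool.h>
#include <stdint.h>

#define DEBUG_PRINT_INTERVAL 20u /* timer cycles between debug lines */

/* Hypothetical helper: call once per timer cycle. Returns true on every
 * 20th consecutive cycle spent in Central, which is when the caller
 * should print its debug line. */
static bool central_debug_due(bool in_central)
{
    static uint8_t cycles_in_central = 0u;

    if (!in_central) {
        cycles_in_central = 0u; /* left Central: restart the count */
        return false;
    }
    if (++cycles_in_central >= DEBUG_PRINT_INTERVAL) {
        cycles_in_central = 0u;
        return true;
    }
    return false;
}
```

In main.c this would be called as central_debug_due(ble_gap_state == BLE_CENTRAL), with the UART print done by whichever UART component the project uses.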

Using the debug lines, I have confirmed that the device stays in BLE_CENTRAL when it is unresponsive to messages. Further, I used the Cypress function "CyBle_GetState()" to determine the state at the time it gets stuck, which is "CYBLE_STATE_DISCONNECTED". At first I thought it was possible that my Android and iOS devices were failing to disconnect, but that is apparently not the case.

Any idea why I am getting stuck in Central? I can likely do a soft reset if the timer reaches a certain point and I'm in the Central + disconnected state, but I'd rather avoid that if I can properly address the issue. Either the stack is messed up, or the watchdog timer is failing to count properly. If the watchdog is the issue, then a soft reset won't do anything to help.

As I see it, if the watchdog timer fails, the code below won't switch me back to the Peripheral role. I already know that ble_gap_state is BLE_CENTRAL and that the state is CYBLE_STATE_DISCONNECTED. The only part of the condition I haven't verified is the watchdog comparison on its second line. Then again, it may or may not even be the watchdog timer at all!

    /* If the current role is Central and the Central time has exceeded the preset time,
     * then set the flag to switch role to Peripheral */
    if((BLE_CENTRAL == ble_gap_state) &&
        (WatchDog_CurrentCount() - centralStartedTime > CENTRAL_STATE_SPAN) &&
        ((CYBLE_STATE_DISCONNECTED == CyBle_GetState()) || (CYBLE_STATE_SCANNING == CyBle_GetState())))
    {
        /* Switch role flag set, switching us from BLE_CENTRAL to BLE_PERIPHERAL */
        switch_Role = TRUE;
        return;
    }

Any thoughts on this? Thanks in advance!

AnYo_3398046

Hi All,

I just did some testing of the watchdog counter. It appears that the watchdog cycles between the values of 0 and 65,535. Knowing this is helpful because of the line:

(WatchDog_CurrentCount() - centralStartedTime > CENTRAL_STATE_SPAN)

The variable "CENTRAL_STATE_SPAN" is 45.

Doing a little bit of thinking here: if centralStartedTime is set to 65,490 or higher, the above condition will NEVER be true, because the watchdog count tops out at 65,535 and then returns to 0. I can't be sure whether this is the entire issue, but it is worth testing. The best solution I can think of is simply setting centralStartedTime to 0 if WatchDog_CurrentCount() >= 65,490. In this rare case, the downside is that we could be in Central for a maximum of 90 counts instead of 45, which is essentially 9 seconds instead of 4.5 seconds.
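The arithmetic above can be checked off-target with a small C sketch. It assumes, as the observed 0 to 65,535 range suggests, that WatchDog_CurrentCount() and centralStartedTime are both 16-bit unsigned values, that CENTRAL_STATE_SPAN is a plain int constant of 45, and that int is wider than 16 bits (true on a 32-bit MCU and on a PC). The helper names are hypothetical and not part of the Cypress example.

```c
#include <stdbool.h>
#include <stdint.h>

#define CENTRAL_STATE_SPAN 45 /* ticks, value quoted in this thread */

/* Same comparison as the example project, assuming 16-bit operands.
 * Both operands are promoted to int before the subtraction, so the
 * difference turns negative once the counter has wrapped past 65,535. */
static bool span_elapsed(uint16_t now, uint16_t started)
{
    return (now - started) > CENTRAL_STATE_SPAN;
}

/* Hypothetical helper: tries every possible counter value and reports
 * whether the timeout check can ever succeed for a given start time. */
static bool timeout_can_fire(uint16_t started)
{
    for (uint32_t now = 0; now <= UINT16_MAX; ++now) {
        if (span_elapsed((uint16_t)now, started)) {
            return true;
        }
    }
    return false;
}
```

Under these assumptions, timeout_can_fire(65489) is true (65,535 - 65,489 = 46), while timeout_can_fire(65490) and anything above it is false, which matches the threshold worked out above.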

During further testing, I was able to modify the current count on the watchdog counter to be over the limit mentioned above, 65,490, and sent a message to the device. It remained in Central mode until I reset it manually by cutting the power. I received the same UART messages as before when the error first occurred randomly.

Solution and Explanation Below

The error begins at this line in ble_process.c:

    centralStartedTime = WatchDog_CurrentCount();

In the instance that centralStartedTime is set to a count of 65,490 or higher, we remain in Central until the power is cycled. This is because the following comparison can never be true:

    (WatchDog_CurrentCount() - centralStartedTime > CENTRAL_STATE_SPAN)

If centralStartedTime is 65,490 and the watchdog reaches its maximum value, 65,535, the difference is only 45. Since CENTRAL_STATE_SPAN is exactly 45 and the comparison is strictly greater-than, the statement is false. From there, it only gets worse, since the watchdog counter resets back to 0.

To remedy this, I have added the below code beneath the initial line:

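    // This line is the current code
    centralStartedTime = WatchDog_CurrentCount();

    // The lines below have been added as a solution.
    // 65480 leaves a 10-count margin below the 65,490 threshold described above.
    if(centralStartedTime >= 65480){
        // Restart the watchdog count so the comparison cannot straddle the rollover
        current_watchdog_counter = 0;
        centralStartedTime = 45;
    }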
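An alternative that avoids touching the counter is to make the subtraction itself wrap-safe. The sketch below is not the fix used in this thread and has not been tried on the hardware; it assumes the counter and centralStartedTime are 16-bit unsigned values, and the function name is hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

#define CENTRAL_STATE_SPAN 45 /* ticks, value quoted in this thread */

/* Casting the difference back to uint16_t makes it wrap modulo 65,536,
 * so it equals the number of ticks elapsed even across a rollover. */
static bool central_span_elapsed(uint16_t now, uint16_t started)
{
    return (uint16_t)(now - started) > CENTRAL_STATE_SPAN;
}
```

Applied to the example's condition, this would correspond to writing (uint16_t)(WatchDog_CurrentCount() - centralStartedTime) > CENTRAL_STATE_SPAN. It stays correct as long as the check runs at least once per full counter cycle, and it keeps the Central window at 45 counts instead of stretching it in the rollover case.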

I hope this fix is useful to someone in the future!

Best Regards,

Andy