Connection drops after a while -- trying to figure out how to debug

legic_1490776

I am writing an application for the 20732S in which my slave sends a series of notifications to the master.

The connection is established OK and works fine for a while: once per second the slave sends 2 or 3 notification messages, then sends nothing for about 1 second.  The data is received OK on the master (in this case a Galaxy S3).  The connection interval is the standard rate, about 50 ms, so between each set of notifications there are about 19 empty connection intervals.
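As a sanity check on those numbers, here is a minimal sketch of the arithmetic in C. The function names are made up for illustration, and it assumes each burst of 2 or 3 notifications fits inside a single connection event, which is what the "about 19 empty intervals" figure implies.

```c
#include <assert.h>
#include <stdint.h>

/* Number of connection events that fall inside one application period. */
uint32_t events_per_period(uint32_t period_ms, uint32_t conn_interval_ms)
{
    assert(conn_interval_ms > 0);   /* guard against division by zero */
    return period_ms / conn_interval_ms;
}

/* Events that carry only empty PDUs, assuming the notifications in each
 * period occupy `busy_events` connection events (an assumption, see above). */
uint32_t empty_events_per_period(uint32_t period_ms,
                                 uint32_t conn_interval_ms,
                                 uint32_t busy_events)
{
    uint32_t total = events_per_period(period_ms, conn_interval_ms);
    return (busy_events >= total) ? 0 : total - busy_events;
}
```

With a 1000 ms period and a 50 ms interval this gives 20 connection events per period, 19 of them empty if the burst uses one event, which matches what the sniffer shows.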

I can see the connection on a sniffer and I can see the packets being transmitted.  However, after a while, just as the next set of notifications is about to be sent, the connection breaks down.  The master continues to attempt to send an empty PDU on each connection interval, but the slave device no longer sends any response.  Neither side of the connection seems to be aware that the connection has broken down.  On the Android side I have a timeout that tries to close the connection, but the empty PDUs keep being sent from the master.  On the tag side, everything appears to be fine as well: my code continues to push notifications, and the connection down callback is not called until much later.  Note that I have the connection idle timer set to '3' (seconds I guess?), but this doesn't seem to get triggered.

So it appears that something in the stack is wedged and maybe is continuing to receive connection interval PDUs, but is failing to send any packets in response, and not tripping the connection idle timeout either.

I'm setting the connection idle timeout to 3 in the configuration structure -- not sure if this actually has any effect.  I'm not sure how I should be setting the connection idle timer using emconinfo_setIdleConnTimeout(), and if I were to use this function, what unit is the timeout expressed in?  I could not find this documented.

Anyway - how can I debug this?  How can I tell whether the tag is receiving the PDUs and whether it is sending anything in return?  Am I setting the connection idle timer correctly and if so what should I expect it to do?  If the device is getting wedged, is there any workaround?  Is there something I might be doing to cause this?  It does appear that the failure is correlated with my code attempting to send packets.

5 Replies
legic_1490776

UPDATE: I am pretty sure this is caused by a bug in the Bluetooth stack / OS.  My guess is that it is a race condition or timing issue triggered by a GPIO interrupt firing at an inconvenient time.

I first tried changing the timing of notification sends, sending on the next fine timer callback, but this had no impact. However, by disabling interrupts and instead polling the SPI device off a fine timer, the problem goes away.  This suggests to me that the problem is related to interrupts. 
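For anyone wanting to try the same workaround, the polling approach can be sketched roughly as below. This is only an illustration under stated assumptions: the hook names (data_ready, read_sample, notify) and the sim_* helpers are hypothetical and are not SDK calls. On the tag, poll_device_once() would be called from the fine timer callback, with the hooks wired to the real SPI read and notification send.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical hooks; none of these names come from the SDK. */
typedef struct {
    bool   (*data_ready)(void);                        /* e.g. read the level of the device's IRQ line */
    size_t (*read_sample)(uint8_t *buf, size_t cap);   /* e.g. one SPI transaction */
    void   (*notify)(const uint8_t *buf, size_t len);  /* e.g. queue one notification */
} poll_hooks_t;

/* Called from the periodic timer callback instead of a GPIO interrupt handler.
 * Forwards at most max_reads samples per tick so that a single tick cannot
 * monopolise the application thread. Returns the number of samples forwarded. */
unsigned poll_device_once(const poll_hooks_t *hooks, unsigned max_reads)
{
    uint8_t buf[20];   /* 20 bytes fits a notification at the default 23-byte ATT MTU */
    unsigned forwarded = 0;

    assert(hooks && hooks->data_ready && hooks->read_sample && hooks->notify);

    while (forwarded < max_reads && hooks->data_ready()) {
        size_t len = hooks->read_sample(buf, sizeof buf);
        if (len == 0) {
            break;     /* device reported ready but returned nothing; retry next tick */
        }
        hooks->notify(buf, len);
        forwarded++;
    }
    return forwarded;
}

/* Simulated device, only so the sketch can be exercised on a host PC. */
int sim_pending = 0;    /* samples waiting in the pretend device */
int sim_notified = 0;   /* notifications "sent" so far */

bool sim_data_ready(void) { return sim_pending > 0; }

size_t sim_read_sample(uint8_t *buf, size_t cap)
{
    if (cap == 0 || sim_pending == 0) {
        return 0;
    }
    buf[0] = (uint8_t)sim_pending;   /* dummy one-byte payload */
    sim_pending--;
    return 1;
}

void sim_notify(const uint8_t *buf, size_t len)
{
    (void)buf;
    (void)len;
    sim_notified++;
}
```

The per-tick cap is a design choice, not something the workaround requires: it bounds how long each timer callback runs, at the cost of spreading a large backlog over several ticks.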

This temporary fix means connections last a lot longer, but it points to a real stability issue any time you use interrupts.

Fine timer callbacks and GPIO interrupts are both serialized as callback events to the application thread. So this is unlikely to be a problem with interrupts. Are you using the tag board for your tests or is this a custom design with a module?

>> Fine timer callbacks and GPIO interrupts are both serialized as callback events to the application thread.


This is why I believe that the problem is actually a bug in the stack having to do with an interaction between the GPIO ISR and the Bluetooth stack: when I disable interrupts and invoke my handler from the fine timer rather than from the interrupt, the problem disappears.  But these two cases should be equivalent at the application level, since they are both serialized.


>> Are you using the tag board for your tests or is this a custom design with a module?


It is a custom design with a 20732S module.  The interrupts are generated by an external component.


Hi ldgirod,

Did this problem go away with the crystal warmup work-around activated?

I have not tested re-enabling interrupts with the longer crystal warmup time.
