GKI Exception - out of buffers when using BLE scan and TLS for HTTP simultaneously

agmi_3321141

Hi,

I am facing a problem where code that runs fine for HTTP requests fails when I use the HTTP_USE_TLS flag to switch to HTTPS requests. I am scanning for BLE packets (using the BLE observer) and sending them over HTTP. A few moments after starting the scan I see "GKI exception - out of buffers" and the system hangs. I tried changing the number of buffers based on the document "Application buffer pools", but that did not help. At first glance, this seems to have something to do with using TLS together with BLE. I hope you can offer some help debugging this.

1 Solution

One of the things I found helpful was to reduce the TLS version to 1.1 or lower [1.2 straight up does not work with BLE scanning] and to reduce the priority of my HTTP thread to below that of the worker thread that handles the TLS part. With a thread priority of 5 I still see the mbed record write failures, so it seems the priority needs to be lower than that too.
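
A minimal sketch of the priority relationship described in this post, for illustration only. It assumes ThreadX-style numbering, where 0 is the most urgent priority and a numerically larger value means a lower priority. The names `TLS_WORKER_PRIORITY`, `HTTP_THREAD_PRIORITY` and `runs_below()` are hypothetical stand-ins, not SDK identifiers, and the values are examples rather than recommendations.

```c
#include <stdbool.h>

/* Assumption: ThreadX-style numbering, 0 = most urgent, larger = lower priority. */
#define TLS_WORKER_PRIORITY   5u  /* hypothetical: priority of the worker doing TLS */
#define HTTP_THREAD_PRIORITY  7u  /* hypothetical: candidate priority for the HTTP thread */

/* Compile-time guard (C11): the build fails if the HTTP thread is not
 * strictly lower in priority than the TLS worker. */
_Static_assert(HTTP_THREAD_PRIORITY > TLS_WORKER_PRIORITY,
               "HTTP thread must run below the TLS worker");

/* Returns true when `candidate` is strictly lower in priority than `reference`
 * under the numbering assumed above. */
static bool runs_below(unsigned candidate, unsigned reference)
{
    return candidate > reference;
}
```

In a real application the chosen value would be passed as the priority argument when the HTTP thread is created. The guard only documents the intended ordering; it does not by itself explain or fix the record write failures.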

9 Replies

Hi shjl​,

Both threads point to changing the buffer pool count, but as I mentioned in my original post, that did not help. I tried increasing the small and medium buffer pools based on the document "Application buffer pools". Also, this happens only when I enable TLS (default: v1.2). From the document it looks like the buffer pools are for BT, so how are they affected by TLS?
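
For anyone comparing pool settings, here is a rough sketch of how much RAM a given pool table reserves for buffer payloads. The struct `pool_cfg_t`, the helper `pool_payload_bytes()` and the table values are all hypothetical and are not taken from the SDK or from the "Application buffer pools" document; the real configuration type and defaults may differ. Whether the BT pools and TLS actually compete for the same memory is not established in this thread.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for one pool entry: size of each buffer and how many exist. */
typedef struct {
    uint16_t buf_size;
    uint16_t buf_count;
} pool_cfg_t;

/* Sum of buf_size * buf_count over all pools.
 * Counts payload bytes only and ignores any per-buffer allocator overhead. */
static size_t pool_payload_bytes(const pool_cfg_t *pools, size_t n)
{
    size_t total = 0;
    for (size_t i = 0; i < n; ++i) {
        total += (size_t)pools[i].buf_size * pools[i].buf_count;
    }
    return total;
}

/* Illustrative values only, not SDK defaults. */
static const pool_cfg_t example_pools[] = {
    {  64, 8 },  /* small  */
    { 360, 8 },  /* medium */
    { 660, 4 },  /* large  */
};
```

Calling `pool_payload_bytes(example_pools, 3)` on the table above gives the payload total for that configuration, which makes it easier to see what each increase in buffer count costs before trying it on the device.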

shjl

The GKI Exception can still happen sometimes (even without TLS).

Increasing the buffer pool count does not help, and there is nothing the user can do about this issue.

It might be a bug in the closed-source BT library.

This issue was reported a long time ago, but I don't see any related bug fix in the CHANGELOG.

Also note that the device no longer works once the GKI Exception happens. It is indeed a problem that needs a fix.

Hi shjl​,

After compiling with TLS v1.1, I keep getting bus faults on the worker thread that handles TLS. The same worker thread that gives me "Mbedtls_ssl_write_record failed" is the one that ends up with a bus fault. When I get a precise fault, the program counter points to 0x803178a, which is in the AES source file inside the AES_RROUND macro. Have you observed this behavior, and what can be done about it? I'm not sure whether my application has anything to do with this.

0803173a:   ble.w 0x80318c4 <mbedtls_aes_decrypt+544>
0803173e:   ldr r5, [pc, #812]      ; (0x8031a6c <mbedtls_aes_decrypt+968>)
08031740:   ldr.w r8, [r5, r10, lsl #2]
08031744:   ldr.w r5, [r7, r6, lsl #2]
08031748:   ldr r6, [pc, #800]      ; (0x8031a6c <mbedtls_aes_decrypt+968>)
0803174a:   eor.w r8, r8, r5
0803174e:   ldr.w r5, [r0, #-32]
08031752:   eor.w r8, r8, r5
786              AES_RROUND( Y0, Y1, Y2, Y3, X0, X1, X2, X3 );
08031756:   ubfx r5, r1, #8, #8
0803175a:   mov.w r9, r3, lsr #24
0803175e:   ldr.w r5, [lr, r5, lsl #2]
08031762:   ldr.w r9, [r6, r9, lsl #2]
08031766:   eor.w r5, r8, r5
0803176a:   ubfx r8, r3, #16, #8
0803176e:   mov.w r10, r1, lsr #24
08031772:   ldr.w r8, [r12, r8, lsl #2]
08031776:   ldr.w r10, [r6, r10, lsl #2]
0803177a:   eor.w r8, r5, r8
0803177e:   uxtb.w r5, r11
08031782:   ldr.w r5, [r7, r5, lsl #2]
08031786:   eor.w r9, r9, r5
0803178a:   ldr.w r5, [r0, #-28]

Hi shjl, do you have any additional input on the error reported above? I almost consistently get a hard fault with the program counter around the AES_RROUND macros.

AxLi_1746341

vmore_3321141

Since you are using the BLE observe API, you might want to know about this issue, FYI:

Re: sdk-6.1: Setting scan_mode=BTM_BLE_SCAN_MODE_PASSIVE does not work

shjl

I appreciate your comment on the "Setting scan_mode=BTM_BLE_SCAN_MODE_PASSIVE does not work" issue.

I reported this regression quite a long time ago; however, there has been no fix and no response from Cypress.

I'm not sure if there is a new way to set passive scan mode or something else.

I cannot figure out a way to make BTM_BLE_SCAN_MODE_PASSIVE work with wiced_bt_ble_observe().

shjl I changed the TLS version in wiced_default.h to 1.1, and the application did not crash right away as it did when the TLS version was 1.2. However, I do see an error print once in a while: Mbedtls_ssl_write_record failed

along with an out of buffer exception once in a while. I was running in debug mode, so the out of buffer exception did not totally crash the app, but after a while the program hit a hardfault. These hardfaults were sometimes Instruction Bus Error and sometimes Precise Data Bus Error (axel.lin_1746341, I am not sure whether this is related to your BLE passive scan problem or to something in TLS).

The backtrace produced this output:

bt
#0  0x080152ba in _tx_thread_system_return ()
#1  0x080141da in _tx_thread_system_suspend ()
#2  0x08016c68 in _tx_semaphore_get ()
#3  0x08014c42 in _txe_semaphore_get ()
#4  0x0800ce66 in host_rtos_get_semaphore (semaphore=0x2000b9e0 <system_monitor_thread_semaphore>, timeout_ms=5000, will_set_in_isr=WICED_FALSE) at WICED/RTOS/ThreadX/WWD/wwd_rtos.c:241
#5  0x08025d68 in wiced_rtos_get_semaphore (semaphore=0x2000b9e0 <system_monitor_thread_semaphore>, timeout_ms=5000) at WICED/RTOS/ThreadX/WICED/../../wiced_rtos_common.c:253
#6  0x0801e93a in system_monitor_thread_main (arg=0) at WICED/internal/system_monitor.c:145
#7  0x08017054 in _tx_thread_shell_entry ()
#8  0xffffffff in ?? ()
Backtrace stopped: previous frame identical to this frame (corrupt stack?)

Adjusting buffer pool count is not really helping here as the program always ends up hitting a hardfault.
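
To narrow down which kind of bus fault occurred, the fault status register can be decoded by hand. The sketch below assumes a Cortex-M3/M4 class MCU (the Thumb-2 disassembly and the "Precise Data Bus Error" wording suggest one, but the thread does not name the part). On those cores the Configurable Fault Status Register (CFSR) is at 0xE000ED28 and its bus fault bits sit in bits 8 to 15. The helper `describe_bus_fault()` is a hypothetical name, not an SDK function.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Bus fault bits within the Cortex-M3/M4 CFSR (register at 0xE000ED28). */
#define CFSR_IBUSERR      (1u << 8)   /* instruction bus error */
#define CFSR_PRECISERR    (1u << 9)   /* precise data bus error */
#define CFSR_IMPRECISERR  (1u << 10)  /* imprecise data bus error */
#define CFSR_UNSTKERR     (1u << 11)  /* fault while unstacking on exception return */
#define CFSR_STKERR       (1u << 12)  /* fault while stacking on exception entry */
#define CFSR_BFARVALID    (1u << 15)  /* BFAR (0xE000ED38) holds the faulting address */

/* Writes a comma-separated list of the bus fault flags set in `cfsr` into `out`.
 * Produces an empty string when no bus fault flag is set. */
static void describe_bus_fault(uint32_t cfsr, char *out, size_t out_len)
{
    static const struct { uint32_t mask; const char *name; } flags[] = {
        { CFSR_IBUSERR,     "IBUSERR" },
        { CFSR_PRECISERR,   "PRECISERR" },
        { CFSR_IMPRECISERR, "IMPRECISERR" },
        { CFSR_UNSTKERR,    "UNSTKERR" },
        { CFSR_STKERR,      "STKERR" },
        { CFSR_BFARVALID,   "BFARVALID" },
    };
    size_t used = 0;

    if (out_len == 0) {
        return;
    }
    out[0] = '\0';
    for (size_t i = 0; i < sizeof flags / sizeof flags[0]; ++i) {
        if (cfsr & flags[i].mask) {
            int n = snprintf(out + used, out_len - used, "%s%s",
                             used ? "," : "", flags[i].name);
            if (n < 0 || (size_t)n >= out_len - used) {
                break;  /* output truncated: stop appending */
            }
            used += (size_t)n;
        }
    }
}
```

With the target halted in the fault handler, the register can be read in gdb with `x/wx 0xE000ED28` and the value passed to the helper. If BFARVALID is set together with PRECISERR, the word at 0xE000ED38 is the address of the failing access, which may help tell a corrupted pointer apart from a stack problem.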

I have also hit "Instruction Bus Error" and "Precise Data Bus Error", but I have no idea what is going on.
