We've observed a deadlock in sys_sem_new() using WICED SDK 3.3.1
The problem stems from an interaction between two things: the heap semaphore used by malloc_lock()/malloc_unlock() to guard the heap structures against concurrent access, and the critical section used in sys_sem_new(), where the code inside the critical section also tries to allocate from the heap.
The sequence of events looks like this:
1. Some thread A starts to perform a heap allocation.
2. As part of the allocation, it acquires the heap mutex.
3. Before thread A releases the mutex, another thread S starts a call to sys_sem_new(). Thread S enters a critical section, stopping all interrupts and task switching. Thread A still owns the heap mutex, but can make no progress.
4. Thread S then attempts to allocate heap memory for the new semaphore. It blocks waiting for the heap mutex, which never becomes available because it is held by thread A, and thread A cannot make any progress until the critical section is exited.
5. Thread S only exits the critical section after it has acquired the heap mutex owned by thread A, which never happens.
This is the typical cyclic graph of resource acquisition between threads that causes a deadlock.
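To make the cycle explicit, here is a minimal sketch of the wait-for relationship described above. It is only a model: the names THREAD_A, THREAD_S and in_wait_cycle() are made up for illustration and are not part of the WICED SDK. Each entry says which thread a given thread is blocked on, or -1 if it is able to run.

```c
#include <assert.h>

/* Hypothetical labels for the two threads in the scenario. */
enum { THREAD_A, THREAD_S, THREAD_COUNT };

/*
 * waits_for[i] is the index of the thread that thread i is blocked on,
 * or -1 if thread i is runnable. Returns 1 if following the chain from
 * `start` leads back to `start` (a wait cycle), 0 otherwise.
 */
int in_wait_cycle(const int *waits_for, int n, int start)
{
    assert(start >= 0 && start < n);
    int cur = start;
    for (int steps = 0; steps < n; steps++) {
        cur = waits_for[cur];
        if (cur < 0) {
            return 0;   /* reached a runnable thread: no cycle */
        }
        if (cur == start) {
            return 1;   /* came back to where we started: deadlock */
        }
    }
    return 0;           /* start feeds into a cycle but is not on it */
}
```

In the deadlock, thread A waits on thread S (S holds the CPU through the critical section) and thread S waits on thread A (A holds the heap mutex), so `waits_for` is `{THREAD_S, THREAD_A}` and the function reports a cycle. If S never has to wait for the mutex inside the critical section, its entry becomes -1 and the cycle disappears.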
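A trivial fix is to call malloc_lock()/malloc_unlock() around the critical section in sys_sem_new(). This ensures that the thread already holds the heap mutex before it enters the critical section, so it is guaranteed not to block inside the critical section when allocating memory. (This relies on the heap lock being recursive, since the allocation inside the critical section will take it a second time.)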
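The ordering change can be sketched roughly as below. This is a single-threaded model and not the SDK code: every demo_* name is a hypothetical stand-in, the "heap lock" is just a recursion counter, and the "critical section" is just a nesting counter, so nothing here really blocks or masks interrupts. All it records is whether the heap lock is taken for the first time while already inside a critical section, which is the situation that can deadlock on the target.

```c
#include <assert.h>
#include <stdlib.h>

/* Model state (hypothetical, for illustration only). */
int heap_lock_depth;     /* recursion depth of the modelled heap lock */
int critical_nesting;    /* > 0 while "inside" a critical section */
int risky_acquisitions;  /* first-time lock takes made inside a critical section */

void demo_malloc_lock(void)
{
    if (heap_lock_depth == 0 && critical_nesting > 0) {
        /* On the target this could block forever if another thread holds the lock. */
        risky_acquisitions++;
    }
    heap_lock_depth++;
}

void demo_malloc_unlock(void)
{
    assert(heap_lock_depth > 0);
    heap_lock_depth--;
}

void demo_enter_critical(void) { critical_nesting++; }

void demo_exit_critical(void)
{
    assert(critical_nesting > 0);
    critical_nesting--;
}

/* Stand-in for an allocator that guards the heap with the lock above. */
void *demo_malloc(size_t size)
{
    demo_malloc_lock();
    void *p = malloc(size);
    demo_malloc_unlock();
    return p;
}

typedef struct { int count; } demo_sem_t;

/* Ordering as described in the post: allocate inside the critical section. */
demo_sem_t *demo_sem_new_original(int count)
{
    demo_enter_critical();
    demo_sem_t *sem = demo_malloc(sizeof *sem);
    if (sem != NULL) {
        sem->count = count;
    }
    demo_exit_critical();
    return sem;
}

/* Proposed ordering: take the heap lock first, then enter the critical section. */
demo_sem_t *demo_sem_new_locked(int count)
{
    demo_malloc_lock();                          /* may block, but scheduling still works */
    demo_enter_critical();
    demo_sem_t *sem = demo_malloc(sizeof *sem);  /* recursive take, cannot block */
    if (sem != NULL) {
        sem->count = count;
    }
    demo_exit_critical();
    demo_malloc_unlock();
    return sem;
}
```

With the original ordering the model counts one risky acquisition per call; with the lock taken first it counts none, because the only wait for the heap lock happens while other threads can still run and release it.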
Ideally, though, I would prefer to remove the critical section from sys_sem_new() altogether. My understanding is that no other thread can gain access to the new semaphore before sys_sem_new() returns (unless one maliciously starts enumerating internal structures), so there's no need to enter a critical section in order to take the semaphore.
I don't see anything in the code that requires the critical section, but I'd very much welcome help from anyone who can shed some light on this.