9 Replies Latest reply on Oct 17, 2017 10:28 AM by kostis_2240836

    ssl_handshake_client_async freeze at state SSL_CLIENT_CERTIFICATE

    kafouros

      Hi everyone,

       

      WICED 4.1.0 cannot connect to Azure IoT Hub over a TLS-based MQTT connection because WICED fails to establish the TLS channel.

       

      A direct connection has never worked for us, going back to when our product was based on WICED 3.5.2. That is why we spun up TLS proxies (NGINX) while waiting for a fix. However, 4.1.0 does not seem to solve the issue either, so here is some investigation that may help you with debugging.

       

      We took Wireshark traces for the following scenarios:

      1) Connected MQTT.fx app directly to Azure IoT Hub (OK)

      2) Connected WICED's secure_mqtt app directly to Azure IoT Hub (Not OK)

      3) Connected WICED's secure_mqtt app through NGINX proxy (OK)

       

      After looking at the Wireshark traces, we could see one major difference between our NGINX proxies and Azure IoT Hub. Along with the "Server Hello" message, Azure IoT Hub also sends a "Client Certificate Request" to the client, specifying that a client certificate is requested.
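For anyone who wants to check their own capture programmatically, here is a minimal sketch of how the certificate request can be spotted. It assumes `buf` already holds the reassembled, unencrypted TLS 1.2 handshake messages of the server's first flight (record headers stripped). `handshake_has_certificate_request()` is a hypothetical helper written for this illustration, not a WICED function. Handshake type 13 is `certificate_request` in RFC 5246.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Handshake message type for CertificateRequest (RFC 5246, section 7.4). */
#define TLS_HS_TYPE_CERTIFICATE_REQUEST 13u

/*
 * Walks a buffer of back-to-back TLS handshake messages. Each message has a
 * 4-byte header: 1 byte type, then a 3-byte big-endian body length.
 * Returns true if a CertificateRequest header is found.
 * Hypothetical helper for illustration only.
 */
bool handshake_has_certificate_request(const uint8_t *buf, size_t len)
{
    size_t pos = 0;

    assert(buf != NULL || len == 0);

    while (len - pos >= 4) {
        uint8_t type = buf[pos];
        size_t body_len = ((size_t)buf[pos + 1] << 16) |
                          ((size_t)buf[pos + 2] << 8) |
                          (size_t)buf[pos + 3];

        if (type == TLS_HS_TYPE_CERTIFICATE_REQUEST) {
            return true;
        }
        if (body_len > len - pos - 4) {
            break; /* truncated message: stop scanning */
        }
        pos += 4 + body_len;
    }
    return false;
}
```

In scenario 2 such a check would be expected to return true for the Azure IoT Hub flight and false for the NGINX proxy flight in scenario 3, matching what the traces show.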

       

      However, MQTT authentication in Azure IoT Hub is not based on client certificates, but rather on credentials. In such a case, the client is not expected to send any certificate. This is where the trace from MQTT.fx is useful. That app dealt with the request by simply answering with no certificates. In fact, according to RFC 5246:

      If no suitable certificate is available, the client MUST send a certificate message containing no certificates.

      Unfortunately, WICED's approach is a bit different. What we observed was that the app got stuck in a call to function ssl_handshake_client_async() while processing state SSL_CLIENT_CERTIFICATE. Eventually the app crashed due to a simple watchdog, as bdide_1722426 described in this unanswered question.
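To make the RFC 5246 requirement concrete, here is a minimal sketch of what "a certificate message containing no certificates" looks like on the wire in TLS 1.2: a Certificate handshake message (type 11) whose body is a 3-byte certificate_list length of zero. `build_empty_certificate_msg()` is a hypothetical helper for illustration; it is not part of WICED or BESL, and it does not show how the stack would encrypt or frame the message into a record.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Handshake message type for Certificate (RFC 5246, section 7.4). */
#define TLS_HS_TYPE_CERTIFICATE 11u

/* Total size: 4-byte handshake header + 3-byte certificate_list length. */
#define EMPTY_CERTIFICATE_MSG_LEN 7u

/*
 * Writes a Certificate handshake message with an empty certificate_list
 * into out[]. Returns the number of bytes written (7), or 0 if the buffer
 * is too small. Hypothetical helper for illustration only.
 */
size_t build_empty_certificate_msg(uint8_t *out, size_t out_len)
{
    assert(out != NULL);

    if (out_len < EMPTY_CERTIFICATE_MSG_LEN) {
        return 0;
    }

    out[0] = TLS_HS_TYPE_CERTIFICATE; /* msg_type */
    out[1] = 0x00;                    /* body length (uint24) = 3 */
    out[2] = 0x00;
    out[3] = 0x03;
    out[4] = 0x00;                    /* certificate_list length (uint24) = 0 */
    out[5] = 0x00;
    out[6] = 0x00;

    return EMPTY_CERTIFICATE_MSG_LEN;
}
```

The resulting bytes are `0b 00 00 03 00 00 00`, which is the kind of reply MQTT.fx sends in scenario 1.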

       

      Sadly, I cannot dive deeper and understand why ssl_handshake_client_async fails. This is where I need some help. As evidence, I have attached below the .pcapng files containing the traces of the scenarios above.

       

      I am really looking forward to a reply from a Cypress representative. This has been a big issue for a long time.

       

      Thanks in advance,

      Stavros.

        • 1. Re: ssl_handshake_client_async freeze at state SSL_CLIENT_CERTIFICATE
          axel.lin_1746341

          I observe the same problem on sdk-3.7.0-7.

          It is surprising that the app gets stuck and then crashes due to the watchdog when no certificate/key is provided.

          WICED's TLS library really needs a fix.

          • 2. Re: ssl_handshake_client_async freeze at state SSL_CLIENT_CERTIFICATE
            kostis_2240836

            Very relevant question, precisely documented. Cypress must fix this!

            • 3. Re: ssl_handshake_client_async freeze at state SSL_CLIENT_CERTIFICATE
              anyu

              Thanks for reporting this. Please use this workaround in the meantime.

               

              libraries/protocols/MQTT/mqtt_network.c

              in mqtt_network_init(), make the following change

               

              else
              {
                  wiced_tls_init_context( &socket->tls_context, NULL,  (const char*) conn->peer_cn );
              }

               

              to

               

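              else
              {
                  /* Workaround: pass a zeroed placeholder identity instead of NULL.
                   * Note: this allocation is not freed here; release it in your cleanup code. */
                  wiced_tls_identity_t *fake_tls_identity = malloc( sizeof(wiced_tls_identity_t) );
                  if( fake_tls_identity != NULL )
                  {
                      memset( fake_tls_identity, 0, sizeof(wiced_tls_identity_t) );
                      wiced_tls_init_context( &socket->tls_context, fake_tls_identity,  (const char*) conn->peer_cn );
                  }
                  else
                  {
                      /* Allocation failed: fall back to the original call with no identity. */
                      wiced_tls_init_context( &socket->tls_context, NULL,  (const char*) conn->peer_cn );
                  }
              }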
              • 4. Re: ssl_handshake_client_async freeze at state SSL_CLIENT_CERTIFICATE
                kafouros

                Hi @anyu! Thanks for your reply. The fix indeed works in the test app, which is based on SDK 4.1.0! The device managed to connect to Azure IoT Hub and I have exchanged messages.

                 

                But I have two questions:

                1) Is the dynamically allocated resource released in wiced_tls_init_context?

                2) Is this fix applicable in SDK 3.7.0?

                • 5. Re: ssl_handshake_client_async freeze at state SSL_CLIENT_CERTIFICATE
                  kafouros

                  Never mind about question (2). It also works on SDK 3.7.0

                   

                  Thanks!

                  • 6. Re: ssl_handshake_client_async freeze at state SSL_CLIENT_CERTIFICATE
                    axel.lin_1746341

                    anyu wrote:

                     

                    Thanks for reporting this, please use this workaround in the meantime

                     

                    libraries/protocols/MQTT/mqtt_network.c

                    in mqtt_network_init(), make the following change

                     

                    else
                    {
                        wiced_tls_init_context( &socket->tls_context, NULL,  (const char*) conn->peer_cn );
                    }

                    to

                    else
                    {
                        wiced_tls_identity_t *fake_tls_identity = malloc( sizeof(wiced_tls_identity_t) );
                        if( fake_tls_identity != NULL )
                        {
                            memset( fake_tls_identity, 0, sizeof(wiced_tls_identity_t) );
                            wiced_tls_init_context( &socket->tls_context, fake_tls_identity,  (const char*) conn->peer_cn );
                        }
                        else
                        {
                            wiced_tls_init_context( &socket->tls_context, NULL,  (const char*) conn->peer_cn );
                        }
                    }

                    This is very strange code.

                    So if the malloc for fake_tls_identity fails, it falls back to the same code as before?

                     

                    In addition, what happens once you fix the BLE library? How will users know that the additional malloc is no longer needed?

                    • 7. Re: ssl_handshake_client_async freeze at state SSL_CLIENT_CERTIFICATE
                      anyu

                      This is a workaround, not a fix. The downside is that the allocated resources do not get freed; you can free them in your cleanup code. This thread can be updated when the BESL fix is included in the SDK.
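One way to handle the cleanup is to keep the pointer next to the connection and free it on teardown. The following is only a sketch under assumptions: `demo_tls_identity_t` stands in for `wiced_tls_identity_t` so the example compiles without the SDK, and `demo_connection_t`, `demo_connection_init()` and `demo_connection_deinit()` are hypothetical names, not WICED APIs. It also returns an error when the allocation fails rather than silently falling back to the NULL identity, which is a design choice of this sketch and not part of the workaround above.

```c
#include <assert.h>
#include <stdlib.h>

/* Stand-in for wiced_tls_identity_t (assumption: lets the sketch build without the SDK). */
typedef struct {
    unsigned char opaque[64];
} demo_tls_identity_t;

/* Hypothetical per-connection record that owns the placeholder identity. */
typedef struct {
    demo_tls_identity_t *placeholder_identity;
} demo_connection_t;

/* Allocates a zeroed placeholder identity. Returns 0 on success, -1 on failure. */
int demo_connection_init(demo_connection_t *conn)
{
    assert(conn != NULL);

    /* calloc zeroes the memory, replacing the malloc + memset pair. */
    conn->placeholder_identity = calloc(1, sizeof *conn->placeholder_identity);
    if (conn->placeholder_identity == NULL) {
        return -1; /* report the failure instead of reverting to the stalling path */
    }

    /* Real code would pass conn->placeholder_identity to wiced_tls_init_context() here. */
    return 0;
}

/* Releases the placeholder identity; safe to call more than once. */
void demo_connection_deinit(demo_connection_t *conn)
{
    assert(conn != NULL);

    /* Real code would tear down the TLS context before freeing the identity. */
    free(conn->placeholder_identity);
    conn->placeholder_identity = NULL;
}
```

Setting the pointer to NULL after `free()` makes a repeated deinit harmless, since `free(NULL)` is a no-op.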

                      • 8. Re: ssl_handshake_client_async freeze at state SSL_CLIENT_CERTIFICATE
                        axel.lin_1746341

                        anyu wrote:

                         

                        This is a workaround, not a fix. The downside is that the allocated resources do not get freed; you can free them in your cleanup code. This thread can be updated when the BESL fix is included in the SDK.

                        People outside of the Cypress team do not know *when* this workaround can be removed.

                        In addition, when other people find this issue on the forum, they usually copy-paste the same code. Such a "workaround" is a source of buggy/redundant code.

                        • 9. Re: ssl_handshake_client_async freeze at state SSL_CLIENT_CERTIFICATE
                          kostis_2240836

                          Is this workaround still relevant with WICED 5.2?
