2 Replies Latest reply on Jun 21, 2016 10:47 PM by sam.lin

    Hard Faulting on memcpy within tls_get_next_record

    cary.wong_1798781

      Turning to the community and cypherbridge for insight on yet another hard fault -- forum searches reveal no existing information on this topic.

       

      We've observed a handful of hard-to-pinpoint hard faults, and pieced together some data from stack dumps.

      Our faults seem to be coming from a memcpy call within tls_get_next_record
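For anyone trying to pull the same data out of a stack dump, here is a minimal sketch of decoding the stacked exception frame. It assumes a Cortex-M target with the basic eight-word frame (no FPU context), which the post does not state explicitly. The names `fault_frame_t`, `decode_fault_frame`, and `print_fault_frame` are hypothetical and not part of WICED.

```c
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/* Basic exception frame a Cortex-M core pushes on exception entry,
 * lowest address first. Assumption: no FPU context is stacked. */
typedef struct {
    uint32_t r0, r1, r2, r3, r12, lr, pc, xpsr;
} fault_frame_t;

/* Copy the eight stacked words, starting at the stack pointer that was
 * active when the fault was taken, into named fields. */
fault_frame_t decode_fault_frame(const uint32_t *sp)
{
    fault_frame_t f = { sp[0], sp[1], sp[2], sp[3],
                        sp[4], sp[5], sp[6], sp[7] };
    return f;
}

/* Print the frame so that LR and PC can be matched against a disassembly. */
void print_fault_frame(const fault_frame_t *f)
{
    printf("r0=%08" PRIx32 " r1=%08" PRIx32 " r2=%08" PRIx32 " r3=%08" PRIx32 "\n",
           f->r0, f->r1, f->r2, f->r3);
    printf("r12=%08" PRIx32 " lr=%08" PRIx32 " pc=%08" PRIx32 " xpsr=%08" PRIx32 "\n",
           f->r12, f->lr, f->pc, f->xpsr);
}
```

In a real handler the pointer would come from MSP or PSP, depending on which stack was in use at the time. Here it is simply any array of eight dumped words.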

       

      Since the source is not available, we can only work with the assembly. The link register (LR) and program counter (PC) point to these instructions at the time of the fault:

      tls_get_next_record:

      ... +152

      bl 0x804e580 <memcpy>

      ldrh.w r9, [sp, #12] (LR)

       

       

      memcpy:

      ... +266

      ldrb.w r4, [r1], #1 (PC)

      cmp r1, r2

      strb.w r4, [r3, #1]!

      bne.n 0x804e68a <memcpy+266>
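For readers less used to Thumb-2, the four instructions at memcpy+266 can be read as the byte-wise loop sketched below in C. The loop exits only when the source pointer equals the end pointer exactly, so a wrong length or pointer keeps it running until an access faults. The helper `unchecked_remaining` is purely hypothetical: it shows one way a caller could produce such a length (an unsigned underflow) and is not a claim about what tls_get_next_record actually does.

```c
#include <stddef.h>
#include <stdint.h>

/* C reading of the loop at memcpy+266.
 * r1 = src, r2 = src_end, r3 = dst - 1 (the store pre-increments r3). */
void byte_copy_loop(unsigned char *dst,
                    const unsigned char *src,
                    const unsigned char *src_end)
{
    do {
        unsigned char b = *src++;   /* ldrb.w r4, [r1], #1 */
        *dst++ = b;                 /* strb.w r4, [r3, #1]! */
    } while (src != src_end);       /* cmp r1, r2 ; bne.n memcpy+266 */
}

/* Hypothetical illustration only: subtracting a larger value from a
 * smaller one in size_t arithmetic wraps around to a huge length. */
size_t unchecked_remaining(uint16_t record_len, uint16_t consumed)
{
    return (size_t)record_len - consumed;
}
```

With a length like that, the copy would walk past the end of valid memory, and the fault would surface inside memcpy rather than in the caller, which matches where the PC was found.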

      For all the occurrences, the devices were on WICED 3.1.2, with no particular timing pattern, or any pattern for that matter. We've recently migrated to WICED 3.5.2, and haven't observed anything since, but it's too early to draw any conclusions. Is this a known issue, and is anybody else experiencing similar faults?

       

      The SDK changelogs do mention a couple of TLS fixes and improvements, but nothing that addresses this issue in particular.

        • 1. Re: Hard Faulting on memcpy within tls_get_next_record
          cypherbridge

          Hi, we have not seen this happen ourselves, but clearly it is occurring. It is certainly difficult to root-cause until a reproducible test case can be devised. The fault address and function are now isolated, so that is a step toward a solution.

           

          As with most of the binary BESL, we can advise and recommend, but the WICED support team ultimately handles features and bug fixes.

           

          As you may know, the Cypherbridge uSSL SDK runs under a WICED wrapper framework, and one aspect of its function is to manage memory resources. Is it possible this issue relates to the content transfer? Perhaps it was mentioned before, but can you be more specific about the session context? Is it client or server? Ingress, certainly. What payload or content: HTTP, proprietary?

          Is there any way to capture the transfer with Wireshark at the peer end of the TLS connection?

           

          Regards,

          /Cypherbridge Support

          • 2. Re: Hard Faulting on memcpy within tls_get_next_record
            sam.lin

            Hi,

            I seem to have hit this issue as well on SDK 3.1.2.

            There is no particular timing pattern; the device randomly reboots after running for a couple of hours.

            After tracking the issue for a while, it seems to hang at the call below:

            result = tls_get_next_record( context, &record, timeout, TLS_RECEIVE_PACKET_IF_NEEDED );

             

            Can someone help?