Hard Faulting on memcpy within tls_get_next_record

CaWo_1798781 · ‎Jun 03, 2016

Turning to the community and Cypherbridge for insight on yet another hard fault. Forum searches turn up nothing on this topic.

We've observed a handful of hard-to-pinpoint hard faults, and pieced together some data from stack dumps.
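
For anyone collecting similar data: assuming a Cortex-M target and the basic eight-word exception frame (no floating-point state), the stacked registers can be read out of a stack dump roughly as follows. This is only a minimal sketch; `exception_frame_t`, `decode_exception_frame`, and `thumb_address` are hypothetical names for illustration, not WICED APIs.

```c
#include <stdint.h>

/* Basic Cortex-M exception frame (no FPU state), in the order the
 * hardware stores it, lowest address first. */
typedef struct {
    uint32_t r0, r1, r2, r3, r12, lr, pc, xpsr;
} exception_frame_t;

/* Copy eight consecutive words from a stack dump into named fields.
 * `words` is assumed to start at the stack pointer value that was
 * active on fault entry (MSP or PSP, depending on the context). */
static exception_frame_t decode_exception_frame(const uint32_t words[8])
{
    exception_frame_t f;
    f.r0   = words[0];
    f.r1   = words[1];
    f.r2   = words[2];
    f.r3   = words[3];
    f.r12  = words[4];
    f.lr   = words[5];  /* return address of the interrupted function call */
    f.pc   = words[6];  /* instruction that was executing at the fault */
    f.xpsr = words[7];
    return f;
}

/* Clear the Thumb bit so the value lines up with a disassembly address. */
static uint32_t thumb_address(uint32_t value)
{
    return value & ~(uint32_t)1u;
}
```

The stacked `lr` and `pc` are the two values that can then be matched against a disassembly listing.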

Our faults seem to be coming from a memcpy call within tls_get_next_record.

Since the source is not available, we can only work with the assembly. At the time of the fault, the link register (LR) and program counter (PC) point to these instructions:

tls_get_next_record:
... +152
bl 0x804e580 <memcpy>
ldrh.w r9, [sp, #12] (LR)
memcpy:
... +266
ldrb.w r4, [r1], #1 (PC)
cmp r1, r2
strb.w r4, [r3, #1]!
bne.n 0x804e68a <memcpy+266>
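
Read as C, the four memcpy instructions above form a byte-at-a-time loop that runs until the source pointer (r1) reaches an end pointer (r2). The sketch below is only an interpretation of that listing, not the library's source; `byte_copy_loop` and its parameter names are made up for illustration.

```c
#include <stdint.h>

/* dst_prev plays the role of r3: it points one byte BEFORE the next
 * destination byte, because the strb uses pre-increment addressing. */
static void byte_copy_loop(uint8_t *dst_prev,
                           const uint8_t *src,
                           const uint8_t *src_end)
{
    do {
        uint8_t byte = *src++;   /* ldrb.w r4, [r1], #1   (reported PC) */
        *++dst_prev = byte;      /* strb.w r4, [r3, #1]!                */
    } while (src != src_end);    /* cmp r1, r2 ; bne.n                  */
}
```

If the fault is precise, a PC on the `ldrb.w` would point at the source read, for example a source pointer or length that runs past valid memory. If it is an imprecise bus fault, an earlier store could be responsible instead. Either way, the stacked r1, r2, and r3 values should show how far the copy had progressed.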

For all the occurrences, the devices were on WICED 3.1.2, with no particular timing pattern, or any pattern for that matter. We've recently migrated to WICED 3.5.2, and haven't observed anything since, but it's too early to draw any conclusions. Is this a known issue, and is anybody else experiencing similar faults?

The SDK changelogs do mention a couple of TLS fixes and improvements, but nothing that addresses this issue in particular.

Anonymous · ‎Jun 06, 2016

Hi, we have not seen this ourselves, but clearly it is occurring. It will be difficult to root-cause until a reproducible test case can be devised. The fault address and function are now isolated, so that is a step toward a solution.

As with most of the binary BESL, we can advise and recommend, but the WICED support team ultimately handles features and bug fixes.

As you may know, the Cypherbridge uSSL SDK runs under a WICED wrapper framework, and one aspect of its function is to manage memory resources. Is it possible this issue relates to the content transfer? Perhaps it was mentioned before, but can you be more specific about the session context? Is it client or server? Ingress, certainly. What payload or content: HTTP, proprietary?

Is there any way to capture the transfer with Wireshark at the peer end of the TLS connection?

Regards,

/Cypherbridge Support

Anonymous · ‎Jun 21, 2016

Hi,

I also seem to have hit this issue on SDK-3.1.2.

There is no particular timing pattern; the device randomly reboots after running for a couple of hours.

After tracking the issue for a while, it seems to hang at the call below:

result = tls_get_next_record( context, &record, timeout, TLS_RECEIVE_PACKET_IF_NEEDED );

Can someone help?
