FX3 Superspeed communication fails on link errors

SvLa_874621

Hi,

we have an FX3 application connected via a 32-bit GPIF to an FPGA. The FX3 firmware is based on the Slave FIFO example with some small changes (endpoint configuration: 2 bulk endpoints, 1 interrupt endpoint). During our extended tests, which continuously transmit megabytes of data, we occasionally see problems on the USB3 link. In this case the FX3 does not accept any data from the host, but from the FPGA side it looks as if the FIFOs are empty.

For analysis we read the PHY and link error counts from the USB3 controller with CyU3PUsbGetErrorCounts. On failing hosts we can see that PHY and link errors increase. Furthermore, we read the CyU3PUsbEventLog and found that when one of these two events happens, the link is blocked afterwards:

#define CYU3P_USB_LOG_USB_HP_TIMEOUT    (0xACu) /* USB 3.0 link header acknowledgement timeout. */

#define CYU3P_USB_LOG_USBSS_LNKFAIL        (0xADu) /* USB 3.0 link failure. */

What could be the reason for this kind of errors?

Is there any way to prevent the USB link from getting blocked when these events happen?

Is there any way to unlock the USB link after seeing these events?

Thanks,

Sven

Rashi_Vatsa
Moderator

Hello Sven,

I would like to get more details on the application to debug the issue

- Please let me know when these events are seen. Is it during transfers from FX3 to the host or from the host to FX3?

On failing hosts we can see that phy and link errors increase

>> Please let me know whether you have tried connecting the FX3 to different host PCs, and whether the events are seen only with some host PCs while others work fine. If yes, what are the specifications of the PCs on which the USB link fails?

>> Please let me know more about the hardware setup. Is the device (FX3) connected directly to the host or via a USB hub?

>> Please share the complete debug prints so that we can check the events logged by the CyU3PUsbEventLog API. Also, let me know what happens after these events are seen. Does the device re-enumerate?

What could be the reason for this kind of errors?

>> CYU3P_USB_LOG_USBSS_LNKFAIL occurs when the number of link errors exceeds the threshold count.

>> CYU3P_USB_LOG_USB_HP_TIMEOUT occurs when the Header Packet acknowledgment has not been received before the timer expires.

Both these events occur due to a bad USB link.

Please confirm that the SDK version you are using is 1.3.4, with the firmware build variable (Project Settings > C/C++ Build > Build Variables) set to 1_3_4.

Also, please try using a different USB cable and let me know if there is any improvement.

Regards,

Rashi


Hi Rashi,

we have already tried different hosts, USB ports and cables, and have also connected USB3 hubs in between. The result always looks different. Some combinations work better while others show link problems.

In the FX3 firmware sources we found one thing that helped us improve the USB link stability. There is a function CyFx3Usb3LnkRelaxHpTimeout() that is normally called in the firmware when two link errors have been detected. We call this function from the beginning and could improve the overall link stability.

Are there more of these "hidden functions" that can help here?

- We use SDK version 1.3.4.

- When the link gets blocked, it does not automatically re-enumerate. We added some logic that detects such a condition based on CyU3PUsbEventLog and triggers the re-enumeration on the device side.
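
The detection logic described here could look roughly like the following sketch. It scans a byte buffer of logged event codes for the two events quoted at the start of the thread. The helper name find_link_fault and the idea of passing the log as a plain array are assumptions for illustration only; how the buffer is obtained from the SDK event log and how re-enumeration is triggered are deliberately left out.

```c
#include <stddef.h>
#include <stdint.h>

/* Event codes quoted earlier in this thread. */
#define CYU3P_USB_LOG_USB_HP_TIMEOUT (0xACu) /* Link header ack timeout. */
#define CYU3P_USB_LOG_USBSS_LNKFAIL  (0xADu) /* USB 3.0 link failure. */

/*
 * Hypothetical helper (not an SDK function): return the index of the
 * first entry in log[start..len) that is one of the two events after
 * which the link was observed to be blocked, or -1 if there is none.
 */
static long find_link_fault(const uint8_t *log, size_t len, size_t start)
{
    for (size_t i = start; i < len; i++) {
        if (log[i] == CYU3P_USB_LOG_USB_HP_TIMEOUT ||
            log[i] == CYU3P_USB_LOG_USBSS_LNKFAIL) {
            return (long)i;
        }
    }
    return -1;
}
```

A firmware thread could call such a helper periodically, remember the last index it has inspected, and start its own recovery when a non-negative index is returned.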

Thanks,

Sven


Hi Sven,

Please replace the library at the following location with the attached one:

FX3 SDK installation path\Cypress\EZ-USB FX3 SDK\1.3\fw_lib\1_3_4

Please put the debug version in the fx3_debug folder and the release version in the fx3_release folder, and then build the application with the new library. Please let me know if this solves the problem.

Note: Please make a copy of the original library before placing the new one.

Regards,

Rashi


Hi Rashi,

thanks for the new version. We will test it.

What are the changes compared to the standard 1.3.4 firmware?

Regards,

Sven


Hi Sven,

The library attached in the previous post is modified to relax the PENDING_HP_TIMER value to 10us and also has some other fixes.

Please let me know if the modified library solves the problem.

Regards,

Rashi


Hi Rashi,

My name is Andre. I am a colleague of Sven.

The previously attached library works better for us. It does not solve all our problems, but we see a clear improvement.

Is it possible to increase the timeout further? Could we get the source code or the source code diff to SDK 1.3.4?

Thanks & Regards,

Andre


Hello Andre,

The latest USB3 specification (table 7.7) relaxes the PENDING_HP_TIMER value to 10us only.

Please let me know whether CYU3P_USB_LOG_USB_HP_TIMEOUT and CYU3P_USB_LOG_USBSS_LNKFAIL are still seen and, if so, how frequently.

Regards,

Rashi


Hi Rashi,

In our test we are constantly transferring data. I can see two errors in the log during the test:

- 0xAB (CYU3P_USB_LOG_EPM_RESET)

- 0xAF (CYU3P_USB_LOG_EP_UNDERRUN);

The errors occur after several minutes of data transfer. I've started the test four times. It failed after 3 minutes, 2:30 minutes, 15 minutes and 8 minutes. The CYU3P_USB_LOG_EP_UNDERRUN error appeared just once. Mostly I see CYU3P_USB_LOG_EPM_RESET.

It seems that the timeout error is gone now. Please advise further.

Thanks & Regards,

Andre


Hello Andre,

The 0xAB (CYU3P_USB_LOG_EPM_RESET) event is registered when CYU3P_USBEP_SS_RESET_EVT is triggered.

Please refer to subsection IV of section 2.3 of the FX3_SDK_Troubleshooting Guide (in the SDK) for the reason for CYU3P_USBEP_SS_RESET_EVT and the workaround.

As per the workaround mentioned in the above document:

- Register for the endpoint event callback

CyU3PUsbRegisterEpEvtCallback (CyFxApplnEpCallback, 0x1B0, 0x04, 0x06);

- Stall the endpoints on CYU3P_USBEP_SS_RESET_EVT

From the gpiftousb example of the SDK:

    if (evtype == CYU3P_USBEP_SS_RESET_EVT)
    {
        /* Stall the affected IN endpoint so that the host issues CLEAR_FEATURE. */
        if (epNum == CY_FX_EP_CONSUMER)
        {
            CyU3PDebugPrint (2, "Halting USB Streaming EP: %d\r\n", BulkRstCnt++);
            CyU3PUsbStall (CY_FX_EP_CONSUMER, CyTrue, CyFalse);
        }
        if (epNum == CY_FX_EP_LOOP_IN)
        {
            CyU3PDebugPrint (2, "Halting USB Loopback EP: %d\r\n", LoopRstCnt++);
            CyU3PUsbStall (CY_FX_EP_LOOP_IN, CyTrue, CyFalse);
        }
    }

- On seeing the STALL on the endpoint, the host is expected to send a CLEAR_FEATURE request.

The recommended recovery procedure for CYU3P_USBEP_SS_RESET_EVT is to STALL the endpoint, and then stop and restart the DMA data path when the CLEAR_FEATURE request is received from the host.
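
The recovery sequence described above can be pictured as a small two-state machine. The following is a minimal sketch and not SDK code: all type and function names are hypothetical, and the comments only mark where the firmware would make the SDK calls shown in this thread. It also simplifies by ignoring a CLEAR_FEATURE that arrives while the endpoint is not stalled.

```c
#include <stdbool.h>

/* Hypothetical model of one endpoint during recovery. */
typedef enum { EP_STREAMING, EP_STALLED } ep_state_t;
typedef enum { EV_SS_RESET, EV_CLEAR_FEATURE } ep_event_t;

typedef struct {
    ep_state_t state;
    unsigned   dma_restarts; /* number of times the DMA path was restarted */
} ep_model_t;

/* Apply one event; returns true if the event changed the state. */
static bool ep_handle_event(ep_model_t *ep, ep_event_t ev)
{
    if (ep->state == EP_STREAMING && ev == EV_SS_RESET) {
        /* Firmware would stall the endpoint here. */
        ep->state = EP_STALLED;
        return true;
    }
    if (ep->state == EP_STALLED && ev == EV_CLEAR_FEATURE) {
        /* Firmware would stop and restart the DMA data path here,
         * then clear the stall. */
        ep->dma_restarts++;
        ep->state = EP_STREAMING;
        return true;
    }
    return false; /* event ignored in this state (simplification) */
}
```

The point of the model is that the DMA restart is tied to the host's CLEAR_FEATURE request, not to the reset event itself.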

As the USB3.0 link is bad, you can try calling CyU3PUsbSetTxSwing API to set the Tx amplitude range for the USB 3.0 signals.

This API sets the Tx amplitude used by FX3 on the USB 3.0 interface. The device has only been tested to work properly under the default swing setting of 0.9V (swing value set to 90). This API is expected to be called before calling the CyU3PConnectState() API to enable USB connections. The swing value should be less than 1.28V (128). You can refer to the FX3APIGuide in the SDK for more details.
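
As a small worked example of the numbers above, the sketch below checks a candidate swing value against the stated limit and converts it to millivolts. The helper names are hypothetical, and the 10 mV per step scale is only inferred from the two data points given here (90 for 0.9V, 128 for 1.28V); please verify it against the FX3APIGuide.

```c
#include <stdbool.h>
#include <stdint.h>

#define TX_SWING_DEFAULT 90u  /* 0.9 V, the tested default per this post */
#define TX_SWING_LIMIT   128u /* values must stay below this (1.28 V) */

/* Hypothetical check: is the swing value below the stated limit? */
static bool tx_swing_is_valid(uint8_t swing)
{
    return swing < TX_SWING_LIMIT;
}

/* Convert a swing value to millivolts, assuming 10 mV per step. */
static uint32_t tx_swing_to_millivolts(uint8_t swing)
{
    return (uint32_t)swing * 10u;
}
```

A value passing this check would then be handed to CyU3PUsbSetTxSwing before the USB connection is enabled, as described above.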

Please let me know if this helps.

Regards,

Rashi


Hi Rashi,

changing the TxSwing did not show any improvement. Is it possible to do something on the RX side?

Furthermore I've implemented the things you've suggested. I saw the CYU3P_USBEP_SS_RESET_EVT on one of our consumer endpoints and called CyU3PUsbStall(EP, CyTrue, CyFalse).

I can see that the setup callback is called again with the parameters setupdat0 = 0x102 and setupdat1 = 0x86. In the setup callback I handle the failing endpoint in the same way as the GpifToUsb example does. Here is the code snippet from the GpifToUsb example:

    if (wIndex == CY_FX_EP_CONSUMER)
    {
        /* NAK the endpoint while the data path is torn down. */
        CyU3PUsbSetEpNak (CY_FX_EP_CONSUMER, CyTrue);
        CyU3PBusyWait (125);

        /* Reset the DMA channel and the endpoint, then re-arm the transfer. */
        CyU3PDmaChannelReset (&glDmaChHandle);
        CyU3PUsbFlushEp (CY_FX_EP_CONSUMER);
        CyU3PUsbResetEp (CY_FX_EP_CONSUMER);
        CyU3PDmaChannelSetXfer (&glDmaChHandle, CY_FX_GPIFTOUSB_DMA_TX_SIZE);

        /* Clear the stall and resume normal operation. */
        CyU3PUsbStall (wIndex, CyFalse, CyTrue);
        CyU3PUsbSetEpNak (CY_FX_EP_CONSUMER, CyFalse);
        isHandled = CyTrue;
        CyU3PUsbAckSetup ();
    }
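
For reference, the setup parameters quoted above (setupdat0 = 0x102, setupdat1 = 0x86) can be unpacked as in the sketch below. It assumes the packing used by the SDK example setup callbacks (request type in the lowest byte of setupdat0, then bRequest, then wValue; wIndex and wLength in setupdat1); the struct and function names are hypothetical. Under that assumption the values decode to a standard CLEAR_FEATURE(ENDPOINT_HALT) request addressed to endpoint 0x86, i.e. IN endpoint 6.

```c
#include <stdint.h>

/* Standard USB setup packet fields. */
typedef struct {
    uint8_t  bmRequestType;
    uint8_t  bRequest;
    uint16_t wValue;
    uint16_t wIndex;
    uint16_t wLength;
} setup_fields_t;

/* Hypothetical helper: unpack the two 32-bit setup words. */
static setup_fields_t decode_setup(uint32_t setupdat0, uint32_t setupdat1)
{
    setup_fields_t f;
    f.bmRequestType = (uint8_t)(setupdat0 & 0xFFu);
    f.bRequest      = (uint8_t)((setupdat0 >> 8) & 0xFFu);
    f.wValue        = (uint16_t)(setupdat0 >> 16);
    f.wIndex        = (uint16_t)(setupdat1 & 0xFFFFu);
    f.wLength       = (uint16_t)(setupdat1 >> 16);
    return f;
}
```

Here bmRequestType 0x02 means a standard host-to-device request with an endpoint as recipient, bRequest 0x01 is CLEAR_FEATURE, and wValue 0 selects the ENDPOINT_HALT feature.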

It seems that I now can detect such errors. But I still don't know how to prevent them or how to recover from them. The USB event log still shows 0xAB (CYU3P_USB_LOG_EPM_RESET). Our data transfer is ended as well.

What are the next steps in your opinion?

Thanks & Regards,

Andre


Hello Andre,

Is it possible to do something on the RX side?

>> Please let me know if the CYU3P_USBEP_SS_RESET_EVT occurs for the Consumer endpoint (IN endpoint) or Producer endpoint (OUT endpoint). The GPIF to USB example registers the events for the IN endpoints only.
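
To make the endpoint masks in the CyU3PUsbRegisterEpEvtCallback call quoted earlier easier to read, here is a small sketch. It assumes that bit n of the OUT and IN endpoint masks selects endpoint number n, which should be verified against the FX3APIGuide; the helper names are hypothetical. Under that assumption, the masks 0x04 and 0x06 cover OUT endpoint 2 and IN endpoints 1 and 2, so an IN endpoint with a different number (for example address 0x86) would need its own bit.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: mask bit for one endpoint address, assuming
 * bit n of the per-direction mask selects endpoint number n. */
static uint16_t ep_mask_bit(uint8_t ep_addr)
{
    return (uint16_t)(1u << (ep_addr & 0x0Fu));
}

/* Bit 7 of a USB endpoint address marks an IN endpoint. */
static bool ep_is_in(uint8_t ep_addr)
{
    return (ep_addr & 0x80u) != 0u;
}
```

With these helpers, IN endpoints 0x81 and 0x82 give an IN mask of 0x06, while IN endpoint 0x86 gives 0x40.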

It seems that I now can detect such errors. But I still don't know how to prevent them or how to recover from them. The USB event log still shows 0xAB (CYU3P_USB_LOG_EPM_RESET).

>> Using the workaround mentioned in the troubleshooting guide, it is expected to recover from the freeze condition when CYU3P_USB_LOG_EPM_RESET is seen.

>> CYU3P_USBEP_SS_RESET_EVT will be notified by the library to the firmware when there are a lot of retries happening on the USB bus as mentioned in the troubleshooting guide.

>> Are CYU3P_USB_LOG_EPM_RESET events seen again after the recovery from CYU3P_USBEP_SS_RESET_EVT is done?

>> How many times is CYU3P_USBEP_SS_RESET_EVT seen after CLEAR FEATURE is sent by the host and the endpoint is reset and flushed?

Our data transfer is ended as well.

>> Please let me know the direction of the transfer. Also, do the transfers stop after CYU3P_USB_LOG_EPM_RESET?

Please confirm that the CyU3PUsbResetEndpointMemories API is not called directly from the firmware.

Regards,

Rashi


Hi Rashi,

sorry for the late reply. Sven and I were busy and on vacation during the last two weeks.

The CYU3P_USBEP_SS_RESET_EVT occurs on a Consumer endpoint.

The workaround from the troubleshooting guide recovers from the freeze condition if I reset both endpoints, Consumer and Producer.

After restarting the data transfer, CYU3P_USBEP_SS_RESET_EVT occurs again. It is just a matter of time when this is seen during the data transfer.

Regarding our data transfer:

We are transferring data through one endpoint and sending it back to the host on a different endpoint. The transfer stops after CYU3P_USB_LOG_EPM_RESET.

I can confirm that we do not call the CyU3PUsbResetEndpointMemories API directly from the firmware.

Usually USB is designed to have a fault-tolerant data transfer mechanism. We can see with a USB analyzer that packets are resent in case of data errors. But in our case, with a bad link quality, it seems that this doesn't work under special circumstances when a data packet gets lost. Is there a possibility to design the firmware so that problems in the data transfer are recovered without resending the data from the driver or the device?

Regards,

Andre


Hello Andre,

Thank you for the update.

Is there a possibility to design the firmware that problems in the data transfer are recovered without resending the data from the driver or the device?

>> Please share the USB traces captured using the USB hardware analyzer (Lecroy) so that we can understand the issue well.

Regards,

Rashi


Hi Rashi,

we use USB tools from Ellisys instead of Lecroy. Are you also able to open these trace files?

If not then I can also provide screenshots.

Regards,

Sven


Hello Sven,

I have Ellisys visual USB software for viewing the .ufo files.

Please let me know if this is the software you are using. If yes, please share the traces. Or let me know the software that will be used to view the files.

Regards,

Rashi


Hi Rashi,

I attached a .u30t file that can be opened with Ellisys USB 3.0 Analyzer.

At several points in the traces (time 4.560386780, time 4.599822756) you can see "Link Bad" packets going from the FX3 to the host. These errors are subsequently recovered by "Recovery Link Training", so they are nearly invisible to the user. In these cases the link is tolerant to these special types of errors.

But sometimes we see that the FX3 freezes completely (time 6.800956650), and this can only be recovered when the FX3 firmware resets the DMA buffers. In this case, however, we receive an error in the device driver and on the user side. What we need is a solution that is also tolerant to these types of errors.

- What is the difference between the recoverable and non-recoverable errors?

- Why does the FX3 freeze sometimes?

- Is there a way to improve the error tolerance in these cases as well (e.g. by resending packets from the driver or FX3 side)?

Regards,

Sven


Hello Sven,

From the traces, it can be seen that the link credit (from FX3) is missing in all the cases (time 4.560386780, time 4.599822756, time 6.800956650) where the error is seen.

For the IN transfers (time 4.560386780, time 4.599822756, time 6.800956650), the host issues the IN token again (retry) on failure, but in the last case (time 6.800956650) the host doesn't issue another OUT token, as the transfer failed due to a missing link credit.

It seems that, because the link is bad, the link credits are being lost.

We can try increasing the SSTX signal amplitude using the CyU3PUsbSetTxSwing API. This API sets the Tx amplitude used by FX3 on the USB 3.0 interface. This API is expected to be called before calling the CyU3PConnectState() API to enable USB connections. Please try increasing the Tx amplitude using this API and let me know if the link errors decrease.

Regards,

Rashi


Hi Rashi,

we have already tried increasing the SSTX signal amplitude using the CyU3PUsbSetTxSwing API, but this did not improve the USB link quality in our setup. This is because we have a TUSB501 re-driver sitting between FX3 and host, and CyU3PUsbSetTxSwing influences only the short path between FX3 and re-driver but not the problematic path from re-driver to host.

I have captured and uploaded some more USB issues (upstream: Link Bad; downstream: framing error, data length mismatch). Could you please have a look and suggest further FX3 firmware changes to improve the error tolerance on a poor USB3 link?

Thanks,

Sven 


Hello Sven,

Please let me know whether the USB hardware analyzer is connected between the re-driver and the host or between the FX3 and the re-driver.

From the trace snippets that you have shared:

- Link_bad: For the first OUT transfer, Link Bad is seen (which means the data received by FX3 contains errors) > the host sends a RETRY > then Link Good and Link Credit are not seen.

- Framing Error: The Link Credit is not seen in the traces.

- Data_Length_mismatch: Neither the Link Credit nor Link Good is sent.

Because we have a TUSB501 re-driver sitting between FX3 and host and CyU3PUsbSetTxSwing influences only the short path between FX3 and re-driver but not the problematic path from re-driver to host.

>> From your response it seems that the traces were taken between the re-driver and the host. In that case it is possible that FX3 sends the link commands (Link Credit/Link Good/Link Bad) but they don't reach the host due to the bad link between the re-driver and the host. Is it possible to take the trace between FX3 and the re-driver to confirm whether the problem is on the FX3 side or not?

- Please let me know the length of the USB connection from FX3 to the re-driver and from the re-driver to FX3.

please have a look and suggest further FX3 firmware changes to improve the error tolerance on a poor USB3 link.

>> It is difficult to comment on this as we need to confirm if the problem is from FX3 as the re-driver connects the FX3 to the Host. As seen in the traces shared earlier, the problem is seen sometimes and the rest of the times the data is successfully sent from FX3.

Regards,

Rashi


Hi Rashi,

Please see the answers to your questions below:

>> Is it possible to take the trace between FX3 and re driver to confirm if the problem is from FX3 side or not?

We have connected the USB analyzer between the re-driver and the FX3. It seems that the USB3 link is less stable here than at the location between host and re-driver. We could capture a failing USB transaction.

Could you please have a look at the attached traces at time 46.793895180 and check whether this issue could be FX3 related?

>> Please let me know the length of the USB between FX3 to re-driver and from re driver to FX3.

The USB connection between re-driver and FX3 is about 50cm.

Thanks,

Sven


Hello Sven,

I have gone through the traces shared. From the traces, I observed that there are 14 failed transactions out of which 4 are IN transactions and the rest are OUT transactions.

IN transactions:

TP (Transaction Packet) with retry bit set is seen. Based on the sequence number in the TP (with RETRY) FX3 is sending back the DP (Data Packet) again. I don't see any problem with the recovery.

For the transaction at 46.793895180, after the RETRY TP is received by FX3, the endpoint is stalled and a clear feature request is sent by the host, after which the sequence number is reset (and the stall is cleared by the firmware), and the transfers start again.

Could you please have a look to the attached traces at time 46.793895180 and check if this issue could be FX3 related.

>> I have checked the +/- 5 transactions from this time:

- The device sends the DP with the sequence number for which the RETRY bit was set.

Then the device sends a NRDY (Not Ready) and later an ERDY (Endpoint Ready).

- After receiving the ACK TP with the next sequence number, the firmware stalls the endpoint as per the workaround in the FX3 troubleshooting guide. On observing the STALL, the host is expected to send a Clear Feature request.

- The device clears the STALL on getting the Clear Feature request.

- The host now sends the ACK TP afresh, i.e., with sequence number 0, but if the device does not have data it responds with NRDY.

- The host is not expected to send an ACK TP after receiving NRDY, but in the traces the host sends ACK TPs even after the device has sent the NRDY.

- The host is expected to wait for ERDY from the device. It seems that the host has not received the NRDY from the device.

>> Please let me know if the device recovers and continues data transfers after the clear feature or does the data transfer stop after the clear feature?

OUT transactions:

FX3 sends a TP with the RETRY bit set for a particular sequence number; based on that, the host sends the DP with the same sequence number again. The OUT transactions are also working as expected.

The RETRYs are expected as the data received/sent by the device might get corrupted due to the bad link.
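
The retry behaviour described for IN transactions can be illustrated with a small model. This is a simplified sketch with hypothetical names: it ignores bursts and flow control (NRDY/ERDY) and relies only on the fact that SuperSpeed sequence numbers are 5 bits wide. The host acknowledges a good DP with the next sequence number, and asks for the same sequence number again with the retry bit set if the DP was corrupted.

```c
#include <stdbool.h>
#include <stdint.h>

#define SEQ_MODULO 32u /* 5-bit sequence numbers wrap from 31 to 0 */

/* Simplified ACK TP: sequence number the host expects next, plus retry bit. */
typedef struct {
    uint8_t seq;
    bool    retry;
} ack_tp_t;

/* Host side: build the ACK TP that follows a received IN data packet. */
static ack_tp_t host_ack_for_in_dp(uint8_t dp_seq, bool dp_ok)
{
    ack_tp_t ack;
    ack.retry = !dp_ok;
    ack.seq   = dp_ok ? (uint8_t)((dp_seq + 1u) % SEQ_MODULO)
                      : (uint8_t)(dp_seq % SEQ_MODULO);
    return ack;
}

/* Device side: the next DP carries the sequence number the host asked for. */
static uint8_t device_next_dp_seq(ack_tp_t ack)
{
    return ack.seq;
}
```

In this model a corrupted DP with sequence number 5 leads to an ACK TP with sequence number 5 and the retry bit set, and the device then sends DP 5 again.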

The USB connection between the re-driver and FX3 is about 50cm.

>> Please let me know if 50 cm is the USB trace length from FX3 to the re-driver on the same board.

If yes, note that AN70707 says: "Minimize the trace length of USB lines as much as possible (<3 inches). These should be routed first to make sure certain recommendations on this list are achievable. Long traces affect the transmitter quality and introduce intersymbol interference (ISI) on the receive side"

Regards,

Rashi


Hi Rashi,

thanks for analyzing the traces.

>> Please let me know if the device recovers and continues data transfers after the clear feature or does the data transfer stop after the clear feature?

From the traces we can see that device recovers after clear feature. There are further transfers from host and responses from the device.

This is OK for us; the FX3 does not lock up anymore when the link is poor.

>> Please let me know if 50 cm is the USB trace length from FX3 to re-driver on the same board.

The re-driver does not sit on the same PCB. We have a cable connection from the FX3 to the re-driver.

The hardware setup is as following:

FX3 <-> 1.5cm board trace <-> USB connector (TypeB) <-> 50cm USB cable <-> USB connector (Micro TypeB) <-> 0.8cm board trace <-> re-driver

Do you see a potential problem here?

Thanks,

Sven


Hello Sven,

Thank you for confirming that the device recovers from the errors.

FX3 <-> 1.5cm board trace <-> USB connector (TypeB) <-> 50cm USB cable <-> USB connector (Micro TypeB) <-> 0.8cm board trace <-> re-driver

>> The trace lengths on the PCB are within the range recommended in AN70707. As the USB link is bad, the problem may lie in the routing of the USB 3.0 signals or in the cable used to connect the device and the re-driver. Please refer to section 12.1 of the AN70707 app note https://www.cypress.com/documentation/application-notes/an70707-ez-usb-fx3-fx3s-hardware-design-guid... for the USB 3.0 routing guidelines. You can also try a different, known-good cable.
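
The comparison against the AN70707 recommendation quoted earlier in the thread (less than 3 inches) is simple arithmetic: 3 inches is 7.62 cm, so the 1.5 cm and 0.8 cm board traces are well within it. A minimal sketch of that check, with a hypothetical helper name:

```c
#include <stdbool.h>

#define CM_PER_INCH      2.54
#define MAX_TRACE_INCHES 3.0 /* "<3 inches" as quoted from AN70707 */

/* Hypothetical check: is a board trace of the given length (in cm)
 * shorter than the quoted 3 inch recommendation? */
static bool trace_length_ok(double length_cm)
{
    return (length_cm / CM_PER_INCH) < MAX_TRACE_INCHES;
}
```

Note that this applies to PCB traces only; the 50 cm cable is a separate element in the signal path.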

Please let me know if you have any queries.

Regards,

Rashi
