SuperSpeed communication fails due to FX3 internal noise issues?


Anonymous
Not applicable

Hi all,

   

We have noticed that the FX3 PHY error count correlates strongly with activity on the GPIF interface. As long as data is handled inside the FX3, the error count typically stays at 0. But if the FX3 itself or an external FPGA starts to drive data onto the GPIF bus, PHY errors appear.

The actual error rate depends on the host chipset type, USB cable length, GPIF interface voltage, etc. The host definitely plays a significant role here: I have a PC with an Intel USB chipset on the motherboard where the tests can run for weeks without any PHY error (if I use a short cable and plug the device into the right port).

But most important: once PHY errors appear, the SuperSpeed communication will most probably fail sooner or later.

In our experience, FX3 clock signal quality also has a significant impact on the error count. You have to keep the clock traces on the PCB as short as possible (do not even think of clocking two chips from the same clock!). Otherwise the effect is similar: PHY errors and communication failure.

   

Next is an excerpt from Cypress tech support response:

   

>>>

   

These errors are not part-part dependent, but channel, activity or noise dependent. A noisy set up will produce more of these errors compared to a quiet or less-noisy set up. More activity in the chip may lead to more noise and more of such errors. However, the layers of the communication protocol (USB) are designed to recover from such dynamic errors. IO toggling results in substrate noise. If you see there are 1336 PHY errors with 3.3V IO supply, 42 PHY errors with 1.8V IO supply. Lowering the supply voltage reduces the substrate noise.

   

<<<

   

I definitely agree with tech support. A few PHY errors can be considered natural at SuperSpeed rates, and the USB protocol should recover from such errors. But my concern is that in practice USB communication with the FX3 fails. Even with a perfect host, if you lengthen the cable a bit so that PHY errors start to appear, the communication also starts to fail. I.e. in practice it does not recover from (FX3) PHY errors.

   

dreitz posted in "SuperSpeed interoperability with USB 3.0 controllers" topic:

   

>>>

   

We are using a LeCroy AdvisorT3 to look at the USB 3.0.  We are seeing what they call Interpacket Symbols - IPS, but we never see any bad CRCs or other data.  It only shows Unknown Packets as a problem.  It's like the FX3 starts spewing garbage.

   

When using our device and the data coming from the GPIF interface, it fails.  When we use our device and the USBBulkLoopAuto sample application, we do not see the garbage.

   

<<<

   

Taking all of the above into account, I am starting to suspect that the FX3 itself breaks USB communication because of its internal noise issues.

I attached a test for exploring the issue with the Cypress FX3 DVK device board (CYUSB3KIT-001).

The test itself is quite simple: the host sends 32-bit toggling data (0x00000000/0xFFFFFFFF) to the FX3, and the FX3 GPIF state machine outputs this data to its pins, causing the pins to toggle as well.
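
For readers who want to reproduce the data pattern with their own host tool, here is a minimal Python sketch of how such a toggling buffer could be built. It is not taken from the attached utilities; the function name and the little-endian word order are assumptions for illustration (for all-zeros/all-ones words the byte order makes no difference).

```python
import struct

def toggling_pattern(num_words, word_a=0x00000000, word_b=0xFFFFFFFF):
    """Build a buffer of 32-bit words that alternate between word_a and word_b.

    Hypothetical helper: little-endian packing is an assumption, not something
    the original test documents.
    """
    words = [word_a if i % 2 == 0 else word_b for i in range(num_words)]
    return struct.pack("<%dI" % num_words, *words)

if __name__ == "__main__":
    # Four words: every data line flips on each successive word.
    print(toggling_pattern(4).hex())
```

Passing `word_a=0, word_b=0` gives the constant, "quiet" pattern used for comparison later in the test scenario.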

   

The FX3 firmware is built by modifying the Cypress Synchronous Slave FIFO (slfifosync) example. The original GPIF state machine is replaced with a new one, and a device vendor request is implemented for querying the FX3 error counters.

The new GPIF state machine outputs all host-sent data to the GPIF pins automatically, without any external control. It also automatically fills the IN pipe buffers for sending to the host. As there is no external GPIF clock, the state machine is modified to use the FX3 internal clock. The GPIF II project files are located in the "FX3device\GPIF II" directory.

Most of the FX3 source modifications are placed between EXPLORE_GPIF_NOISE defines. The source files are in the FX3device subdirectory.

   

Three Windows command line utilities are supplied:

   

1) FX3USBwrite - sends toggling data to FX3 OUT bulk pipe 0x01.

   

2) FX3USBread  - can be used for reading data from FX3 IN bulk pipe 0x81.

   

3) FX3USBerrors - reads FX3 PHY and LINK errors via Control Pipe 0.

   

The source code of the utilities is in the corresponding directories.

   

Testing scenario:

   

1) On FX3 board, set VIO1..VIO5 to 3.3V (higher bank voltage causes more errors).

   

2) Load FX3GPIFNoise.img to FX3.

   

3) Run FX3USBerrors from command prompt with option "-clear".

   

      FX3USBerrors   -clear

   

This reads FX3 error counters and resets them to 0.

   

4) Read errors with FX3USBerrors several times without "-clear" option. Hopefully error counters stay 0.

   

5) Start sending data to FX3 by launching FX3USBwrite.

   

6) Read errors. Hopefully you will see errors appearing. A few errors per tens of minutes should be enough to guarantee USB failure. Leave FX3USBwrite running. After anything from a few minutes to a few days it will exit with an error: USB communication has failed.

   

7) Optionally you can launch FX3USBread concurrently as well, this may increase the probability of USB failure.

   

 If there are no errors, try to lengthen the cable or plug FX3 into another USB port. For example, if your PC has USB3.0 ports also at front, try these.

   

 You can also test FX3 with quiet GPIF. For that send constant data 0 to FX3 with command

   

    FX3USBwrite -data=0

   

I expect that the error counters remain 0, or at least that there are significantly fewer errors.

   

Note about Etron chipset/driver. Etron and FX3 just do not cooperate. USB Control Transfers fail randomly at heavy USB throughput and therefore FX3USBerrors may exit immediately with error code 31. Just retry (of course, if FX3USBwrite still runs).

   

Please test your FX3 and host and give feedback, especially if you have a setup that works reliably with PHY errors. I have 4 different hosts and 2 Cypress FX3 DVK REV3 kits (+ several prototypes of our own design), but not a single combination survives FX3 PHY errors.

   

Thanks,

   

kalev

0 Likes
1 Solution
Anonymous
Not applicable

Hi all,

   

Let me post a final solution that may work for your system. We have now worked through several different designs with the FX3 and an FPGA. Our tests confirm that the FX3 has noise problems; my guess is that Cypress did not register the slave FIFO control signals, since doing so would increase read latency by one extra cycle. The FX3 firmware also had some problems, but they are almost all fixed in the new SDK release (1.3.3). So the first step is to update the SDK.

In one of our designs, we noticed severe signal noise on all IO pins. With VCCO at 1.8V, the observed noise swings from 0.6 to 3.0V, which causes the FX3 slave FIFO state machine to die. This design uses a cheap connector with dual-row SMT pins. We solved this problem completely by reducing the drive strength to 1/4 of full strength (3/4 is the default). This design has split digital and analog power supplies, and the analog supply delivers clean power with ripple below 5mV Vp-p. We observed 0 transmission errors over TBs of data transfer.

In another design, USB fails if the slave FIFO toggles between 0x00000000 and 0xFFFFFFFF. This issue is almost solved by switching to SDK 1.3.3, although the link is still not entirely stable at times. This design uses a single LDO for all 1.2V rails and shows several bit errors after some GBs of data transfer.

In another design, the slave FIFO dies, but less frequently. This design places the FPGA and FX3 together on a single 12-layer board, which may help reduce noise. It also uses split power supplies and likewise shows 0 transmission errors over TBs of data transfer. Further tests with 1/4 drive strength showed a great improvement in slave FIFO stability.

In another design, the bypass capacitors are not placed at the optimal position. This design can only run at 66MHz. If set to 100MHz, it dies within seconds. This design uses a single supply for all 1.2V rails.

In conclusion, to improve the FX3's performance and stability: (1) Split the power supplies. For the FX3, 4 separate supplies may be optimal (1.2V VDD, 1.2V VCC, 1.8V VDD, and 1.8V CVDDQ). Today's high-frequency DC-DC converters (above 1.2MHz) can easily keep output ripple within 5mV Vp-p. (2) Reduce the IO voltage; 1.8V is the best of all. (3) Reduce the IO drive strength to 1/4. This helps reduce noise introduced by the IOs and the transmission lines. (4) For some cross-board designs, a USB 3.0 redriver may be the best way to maintain signal integrity. The redriver should be placed as close to the receptacle as possible. We have confirmed (1) to (3) over some TBs of data transfer; (4) is still being verified, since we have only used the redriver on tiny designs.


0 Likes
34 Replies
Anonymous
Not applicable

Hi Kalev,

   

 

   

It looks like you already contacted Cypress tech support team regarding your issue and now you are looking to get suggestions/comments from the other customers.

   

 

   

Thanks,

   

Sai Krishna.

0 Likes
Anonymous
Not applicable

Hi Kalev,

   

we also have some noise issues. We noticed them during EMI tests with an external power supply and burst noise (EN 61000-4-4). We also used a USB3 analyzer to watch the traffic between host and device (customer board) and saw unknown packets during burst noise. Sometimes the connection could be re-established with a new training sequence, but often it could not. We could not see any problems in the same test environment with a USB2 connection.

Do you have any problems with a USB2 connection? Currently we think it could be an FX3 problem, because none of our attempts to reduce the noise with filters and/or a common-mode choke in the power supply could stop this problem. Maybe the FX3 has some problem with its USB3 transceiver and its PLL?

   

Question to the Support team:

   

Is there any experience with your FX3 evaluation board in EMI burst tests with an external power supply (USB device self-powered)?

Until now, we have not opened a support case with the tech support team.

   

regards MPL

0 Likes
Anonymous
Not applicable

Hi RSKV,

   

I created this topic after I had contacted tech support for the first time.

Yes, I contacted tech support again by reopening the case, and they promised to explore the issue further.

Hopefully any feedback here will help me, tech support and the whole developer community.

   

Thanks,

   

kalev

0 Likes
Anonymous
Not applicable

Hi  MPL,

   

USB2 seems to work well, though I have not tested it as exhaustively as I test USB3.

   

> Maybe the FX3 has some Problem with its USB3 tranceiver and its PLL ?

   

We have found that FX3 USB3 is very sensitive to clock signal quality. We managed to suppress PHY errors significantly on our custom board by filtering the clock chip's power and shortening the clock signal routes. Test results with FX3GPIFNoise.img:

   

Cypress FX3 DVK rev3 board - over 100 PHY errors/min.

   

Our custom board - about 0...1 PHY errors/min.

   

Clearly, noise on the clock signal affects the PHY error count (I guess because it affects the PLL).

Why does the USB3 communication break? I can only guess. Either it's a HW issue, for example the PLL output becomes so bad that the USB3 state machine fails, or the FX3 firmware just resets USB3 if it counts too many PHY errors. I really hope it is a firmware issue, because the failure seems to correlate with the PHY error rate. Note that according to the SDK version 1.2.2 FX3ReleaseNotes.pdf, a similar issue was already fixed:

   

...

   

2) Fixed a USB driver issue which could cause the device to re-enumerate occasionally due to
USB 3.0 link errors that accumulate over a long period of time.

   

...

   

According to my tests, USB3 may break even while the link error count is still 0, and even after device re-enumeration.

   

kalev

0 Likes
Anonymous
Not applicable

 Hi,

One more observation. Similar PHY errors and USB3 breaks can be produced just by lengthening the USB cable. Therefore it seems that there is no direct relationship between FX3 external/internal noise and the USB3 break, i.e. it's not a PLL issue. Could it be that disconnecting USB3 causes the "unknown packets"? If yes, then it's most probably a software issue that just resets/disconnects USB3 when enough PHY errors appear.

Let's see what the tech support will find.

Thanks,

kalev

0 Likes
Anonymous
Not applicable

Any news on this topic?

   

I think we have the same problem (I will open a case later). Depending on the host controller, cable, cable length etc., we also observe connection aborts in SuperSpeed mode, sometimes after a few seconds and sometimes after hours. At first we looked for the problem in our software and the USB 3 cables (there is no problem in USB 2 mode).

To isolate the issue, I used the bulksrcsink example from the SDK, modified it to include a sequence number in the data, and wrote a test program to receive and check the data on the PC. When the problem occurs, the sequence number is wrong and/or the data transfer stops. With shorter cables the problem tends not to occur.
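
A host-side check of this kind could look like the sketch below. The packet layout is an assumption for illustration only (a 32-bit little-endian sequence number in the first four bytes of each packet); the post does not say how the sequence number was actually embedded, and `check_sequence` is a hypothetical name.

```python
import struct

def check_sequence(packets, expected=0):
    """Verify that each packet starts with the next 32-bit sequence number.

    Assumed layout (not from the original post): little-endian uint32 at
    offset 0. Returns (next_expected, errors), where errors is a list of
    (packet_index, expected_seq, received_seq) tuples.
    """
    errors = []
    for index, packet in enumerate(packets):
        (seq,) = struct.unpack_from("<I", packet, 0)
        if seq != expected:
            errors.append((index, expected, seq))
        # Resynchronise on the received value so one gap is reported once.
        expected = (seq + 1) & 0xFFFFFFFF
    return expected, errors

if __name__ == "__main__":
    # Simulate five received packets in which sequence number 3 is missing.
    received = [struct.pack("<I", n) + bytes(12) for n in (0, 1, 2, 4, 5)]
    print(check_sequence(received))
```

Resynchronising after a mismatch keeps a single dropped packet from flagging every packet that follows it.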

   

For us this is a big issue: we cannot risk this kind of problem at our customers, because at the moment there is no fallback (the device has to be switched off and on), and we need 3m cables because of the size of our device.

   

I will try to monitor the PHY Error counter now...

0 Likes
Anonymous
Not applicable

Hi,

   


There is little progress. Regarding this SuperSpeed breaking issue, tech support reported that they may have found something unexpected happening on the LINK layer, and they are discussing the issue with the design team.

Anyway. There are two problems involved:
1) FX3 noise sensitivity that causes PHY errors. You can increase the number of PHY errors dramatically on the Cypress FX3 kit just by outputting data onto the GPIF pins.
2) A possible FX3 (or host) issue that breaks the SuperSpeed connection when PHY errors appear.

According to the USB 3.0 standard, the SuperSpeed physical layer bit error rate is expected to be less than one in 1e12 bits (by my computation, 1 error in 200 seconds). So, in any case, if you want to build a reliable device, you have to design hardware where basically no PHY errors appear. And as you see, you must build better hardware than the FX3 kit.
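
The 200-second figure can be checked with a few lines of Python. The sketch assumes the raw SuperSpeed signaling rate of 5 Gbit/s and a link that transmits continuously; the function name is made up for illustration.

```python
SUPERSPEED_BIT_RATE = 5e9  # raw SuperSpeed signaling rate in bits per second

def seconds_per_bit_error(bits_per_error, bit_rate=SUPERSPEED_BIT_RATE):
    """Average time between bit errors on a continuously active link."""
    return bits_per_error / bit_rate

if __name__ == "__main__":
    interval = seconds_per_bit_error(1e12)   # BER of 1 in 1e12 bits
    print(interval)                          # seconds between errors
    print(60.0 / interval)                   # equivalent errors per minute
```

At the specification limit this works out to one bit error every 200 seconds, or 0.3 per minute on average.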

   

Cypress has kindly agreed to review users' PCB layouts, but I would appreciate it if they could review the FX3 kit first and tell users what to pay attention to in their designs to get fewer PHY errors than the Cypress kit. I assume that Cypress has followed all of their own FX3 Hardware Design Guidelines, but still the kit seems to be far from perfect.
 

   

Thanks,
kalev
 

0 Likes
Anonymous
Not applicable

Hi ,

   

we made some progress with the noise issue in SuperSpeed mode. We now pass the burst test with a modification to our case. This is just a small success. It would be better if the hardware could pass the test without a case and special shielding, because we would also like to sell our board without a case.

We have seen other FX3 customer boards without this problem. Because of this, we ran a physical layer compliance test, which showed no issues. Currently we can use USB3 cables up to 8 m long. We also ran these tests with the Cypress evaluation board and our customer evaluation board (yes, we have two boards with this issue). Also no issues.

I think there must be a very sensitive pin or signal on the FX3 with a non-optimal circuit and/or layout. I agree with kalev: I think Cypress has to find this issue on their own eval board and tell us a solution :).

Thanks, kalev, for your FX3GPIFNoise.zip. With our board we could not see any problems during tests with reading or writing alone. The error counter stayed at zero, but ... if we do both at the same time, increasing the USB3 traffic, the counter increases.

Now we will contact the tech support team. Hopefully, the more customers open a support case, the faster a solution will be provided 🙂

   

regards MPL

0 Likes
Anonymous
Not applicable

Hi,

I have a bit of information to share.

Tech support said that it is the host that gives up and breaks the communication. To reduce noise, tech support suggested trying a metallic shield over the FX3 (and using a more realistic data pattern).

It's clear that using GPIF increases noise and affects SuperSpeed communication. For users, the biggest concern is probably that the FX3 initially enumerates at SuperSpeed, but when the actual data transfer over GPIF starts, the communication with the device may break. So my suggestion is to always check for PHY/LINK errors, especially in your own hardware designs (NB! see my comments below about CyU3PUsbGetErrorCounts).

   

Anyway, for those who are interested in playing with FX3, I attached updated FX3GPIFNoise2.zip. The changes are:

   

1) I found that the CyU3PUsbGetErrorCounts function may return incorrect error counter values, i.e. hide errors. As a workaround, I wrote my own GetErrorCounts implementation and changed the way it gets called. Unfortunately, it's not possible to get 100% true counter values until Cypress fixes the API. More comments in the source.

2) I replaced the three separate host utilities with a single FX3USBtest, which is easier to use. Type

   

FX3USBtest -help

   

for help on its command line options.

   

Below are screenshots from my tests with a Renesas USB3.0 add-on adapter (based on the NEC D720200F1 chip):

   

1) 1m USB3.0 cable is used for connecting FX3 DVK rev 3 to host Renesas adapter:

   

C:\test>FX3USBtest
FX3USBtest version. 1.2
Press ESC for exit
00:00:28 Read/Write=196.2/108.5MB/s Errors PHY/LNK=5/0
Read from device failed (GetOverlappedResult error code=31)
A device attached to the system is not functioning.

   

2) 1.8m cable. My 1.8m cable seems to be better than the 1m cable; it worked for 9 minutes:

   

C:\test>FX3USBtest
FX3USBtest version. 1.2
Press ESC for exit
00:09:42 Read/Write=199.2/ 99.1MB/s Errors PHY/LNK=62/0
Write to device failed (GetOverlappedResult error code=31)
A device attached to the system is not functioning.

   

3) The same 1m cable + 1.8m extension, 2.8m of cable in total. Data pattern all zeros. It has now worked for over 16 hours without errors:

   

C:\test>FX3USBtest -data=0x0,0x0
FX3USBtest version. 1.2
Data pattern = 0x00000000,0x00000000
Press ESC for exit
16:40:51 Read/Write=200.0/109.0MB/s Errors PHY/LNK=0/0
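
To compare runs like the three above, the status line can be turned into an average PHY error rate. The Python sketch below is not part of FX3USBtest; it only parses the line format visible in these screenshots, and the function and field names are made up for illustration.

```python
import re

# Matches lines such as:
#   00:09:42 Read/Write=199.2/ 99.1MB/s Errors PHY/LNK=62/0
STATUS_RE = re.compile(
    r"(\d+):(\d{2}):(\d{2})\s+Read/Write=\s*([\d.]+)/\s*([\d.]+)MB/s"
    r"\s+Errors PHY/LNK=(\d+)/(\d+)"
)

def parse_status(line):
    """Parse one status line into a dict with an average PHY error rate."""
    match = STATUS_RE.search(line)
    if match is None:
        raise ValueError("not a status line: %r" % line)
    hours, minutes, seconds = (int(match.group(i)) for i in (1, 2, 3))
    elapsed = hours * 3600 + minutes * 60 + seconds
    phy = int(match.group(6))
    return {
        "elapsed_s": elapsed,
        "read_mb_s": float(match.group(4)),
        "write_mb_s": float(match.group(5)),
        "phy_errors": phy,
        "link_errors": int(match.group(7)),
        # Average over the whole run; 0.0 if no time has elapsed yet.
        "phy_per_min": phy * 60.0 / elapsed if elapsed else 0.0,
    }

if __name__ == "__main__":
    print(parse_status("00:00:28 Read/Write=196.2/108.5MB/s Errors PHY/LNK=5/0"))
```

Applied to the first two runs, this gives roughly 10.7 and 6.4 PHY errors per minute before the link broke, against 0 for the all-zeros pattern.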

   

Br,
kalev

0 Likes
Anonymous
Not applicable

Hi,

   

any updates on that issue????

   

thanks,

   

lumpi

0 Likes
Anonymous
Not applicable

Hi,

On the FX3 kit, to reduce noise, lower the VIO voltage. Set VIO1...VIO3 to 1.8V and you will get significantly fewer PHY errors. You can also experiment with removing the VIO1...VIO3 jumpers altogether after powering up and loading the program image into the FX3. This way it's possible to reduce the PHY error rate to a level comparable with the rate seen when data is processed inside the FX3.

But when PHY errors appear (at a rate of about 10 errors per minute), you can be sure that at some point the SuperSpeed communication will break. I think this is characteristic of USB3.0 (I can't understand why the host should give up so easily and not retry until it succeeds). Anyway, keep the cable as short as possible. I have managed to build a device that works with about 1...1.8m of cable (tested with the hosts I have available).

br,
kalev

0 Likes
Anonymous
Not applicable

Hi All,

   

I wanted to jump in and follow this post. I have been facing a series of similar issues. Thanks kalev for the tester. We failed with error code 31.

0 Likes
Anonymous
Not applicable
We have the same problem. We designed an FPGA/FX3-based product which uses the GPIF configured as a 32-bit synchronous slave FIFO, and we see a clear correlation between the reported USB physical errors and the activity on the 32-bit parallel slave FIFO data bus. When we run the bus between the FPGA and the GPIF (80 MHz PCLK) at its maximum bandwidth, the physical errors increase significantly, up to the point of breaking the USB 3.0 link.

Joel Brenner
0 Likes
Anonymous
Not applicable

Hi,

I attached FX3USBnoise3.zip that includes sources/images updated to FX3 SDK 1.3.0.

As Cypress published the API sources, I finally got a chance to fix the HW/SW concurrency problem when updating the "Error Count Register". Now the program displays the exact count of PHY/LINK errors as the FX3 HW detects them.

Of course, the program also works when linked with the original Cypress API libraries, but the error counters displayed are then just less accurate.

BTW, I also fixed the CyU3PBusyWait function, which was optimized out when compiling with gcc options -O2, -O3, -Os, etc. The prebuilt FX3GPIFnoise.img provided in the zip is linked with a library compiled with the -Os option.

The modified API library sources are provided in the FX3_lib_1.3.0.118_sources_updated directory.

Br,
Kalev

 

0 Likes
Anonymous
Not applicable

 Same issue here

0 Likes
Anonymous
Not applicable

Hi all,

   

I found this post from kalev yesterday, and I'm a little surprised that there is no comment from Cypress on this problem; the post from 04.03.2013 suggests there is a possible issue. At the same time, I didn't find any errata sheet or note for the FX3 that refers to this problem.

With my own development board (rev 3) I can reproduce the reported problems, and this makes me nervous, because I have to finalize my first prototype design pretty soon and don't want to spend days analyzing old known issues. The first post is more than 14 months old.

   

    

   

               

   

FX3USBtest version. 1.3 - Data pattern = 0x00000000,0x00FFFFFF
00:00:15  Read/Write=143.5/143.3MB/s  Errors PHY/LNK=162/74

   

Frank

0 Likes
Anonymous
Not applicable

Hi,

   

There is no change/progress on the noise and USB communication failure issues. Cypress has not found anything (at least they reported to me that everything is OK in the FX3).

   

As I have already mentioned above, there are two closely related issues:
a) FX3 is sensitive to toggling its own GPIF pins, causing USB SuperSpeed Phy/Link errors.
b) Phy/Link errors cause USB SuperSpeed communication failure.
 

   

Of course, FX3 could be better and be less sensitive, but my main concern is that I do not know the reason why the communication breaks.

   

What I have noticed is that sometimes the content of Bulk IN pipe packets seems to appear in EP0 control packets before the communication finally breaks: my image sends data 0xFFFFFFFF to the host, and the same data appears in EP0 packets. It is as if the FX3 outputs wrong data in the packet, or the host just interprets packets wrongly.

As I have no USB analyzer, I can't explore which side, FX3 or host, misbehaves. It can be the host side as well: some USB3.0 chipsets/drivers seem to be real crap (I have very bad experiences with the Etron driver).

I have got a bit more background information about FX3 PHY/LNK error counter behavior from Cypress tech support. I modified my FX3USBnoise.img to reflect this new knowledge and also updated it to the latest FX3 SDK version 1.3.1.

My own short test showed that version 1.3.1 behaves like the previous ones: if PHY errors appear, sooner or later the communication breaks.

br,
kalev 

0 Likes
Anonymous
Not applicable

Hi kalev, I also ran into another problem: when I start bulk loop, data may be corrupted (some bytes are broken on receipt). A workaround is to set the DMA buffer size to 16 (thank goodness this works, and so far no obvious side effects have occurred). But the instability is the severe problem. I would like to ask for several improvements so that the device works longer, and as a last resort use a USB3 PHY directly (which is difficult but seems to work).

I read your post from May last year about replacing the crystal with an oscillator to increase stability. So I have a question: if I replace my 19.200MHz crystal with an oscillator, is it true that the device won't fail again? Our device is required to run for days, and frequently resetting the device (WDT) is not a good idea, since re-enumeration costs at least 5 seconds. Thanks in advance. I'll go buy a 19.200MHz oscillator to test. I'll use Bus Hound to see what the host receives when the device goes down.

0 Likes
Anonymous
Not applicable

Hi wye1102,

   

I have never seen data corruption myself (not counting buggy FPGA images that do not satisfy timing requirements). But I also have not played a lot with different DMA buffer sizes/counts, so I may just be lucky to have selected the right combination (size=16*1024, count=4).

>if I replace my 19.200MHz crystal with oscillator, is it true that the device won't fail again?

According to my tests, it increases stability, but this does not mean that it will never fail. You only need to use the same device with a slightly longer cable, or in a slightly noisier environment, or with a worse adapter card, etc.
And I have experimented only with replacing the crystal on the Cypress FX3 DVK with an ASDMB-19.200MHZ-LY oscillator. I can't comment on other crystals/oscillators.

We also need very reliable communication: the device should work for several weeks/months continuously. Fortunately, in most use cases there is actually no need for SuperSpeed throughput, so we decided to run our device in USB 2.0 High-Speed mode by default. Users are advised to switch to SuperSpeed mode only if it is really needed.

As the noise level depends on the GPIF bus width, we also limited ourselves to a 16-bit bus (200MB/s is enough for us). SuperSpeed communication seems to be most sensitive to noise generated when the FX3 outputs data to the GPIF bus, i.e. when the FX3 sends data to the FPGA. Fortunately we do not need high throughput in this direction at all, so we use only 8 bits of the bus in this direction, keeping the remaining 8 bits steady at 0. We are also considering some other way (SPI, UART, I2C) to transfer data from the FX3 to the FPGA.
What I currently believe/hope is that, in practice, most users should be able to run our device in SuperSpeed mode with at least a 1m cable.

br,
kalev 

0 Likes
Anonymous
Not applicable

 Hi kalev! Thanks for your reply. I found something useful:

   

https://github.com/Nuand/bladeRF/issues/231

   


   

Here's the content: Hi Ronald,

   

This issue can be avoided by properly decoupling all of the separate FX3 power domains. If you look at page 9 of the schematics ( http://nuand.com/bladerf.pdf ), you will see how each power domain has an a ferite bead and decoupling caps. I verified the BER of the GPIF and found it to be 0 bits after hours of testing on several different units. Are you seeing any anomalies in your sample captures?

   
    
   
   
The official DVK board seems to use LDOs and has small capacitors on AVDD and the RX/TX pairs, and just srcsink with 5 threads can easily break the transmission (can I say immediately?). Our design uses even smaller capacitors (0201) and a buck converter instead of an LDO, so I think this could also be a possible reason (I remember that I did observe fewer PHY/LINK errors after replacing it with an LDO and adding several capacitors, 10 uF each). I'll post another issue and test the board with an LDO and larger capacitors tomorrow, and also send one board out to have a newer chip (fabbed in 2013) soldered on. I hope the new chip solves the stability issue. Our data rate is about 80MB/s, so USB2.0 may not be sufficient, and migrating to GbE is not possible currently.
0 Likes
Anonymous
Not applicable

I tried again, but with no effect. I replaced all power supplies with LDOs, but it still fails. The chip is 1213. It's a problem that the FX3 is so unstable. I think I can only add some auto-reset mechanism.

0 Likes
Anonymous
Not applicable

Hi,

   

> https://github.com/Nuand/bladeRF/issues/231

Unfortunately they do not report what cable lengths they tested with. With a very short cable (USB signals are very good) the effect does not show up.

We use LDOs and separate ferrite beads + decoupling caps for most of the power domains. Unfortunately, as an exception, power for the oscillator and CVDDQ is generated by a buck converter. But still, there is proper filtering with a ferrite and cap. I remember I played a lot with adding/replacing decoupling caps and separating power domains from each other; nothing seemed to give a noticeable positive effect (everything seems to be already at its best).
And as you mentioned, the Cypress FX3 kit uses LDOs and it's still far from perfect.

My experience is that my prototypes from the same batch actually differ: some of them work quite well over a 1.8+1.8=3.6m cable, some fail even with one 1.8m cable.

Also, I have noticed that sometimes the device works quite well after enumeration over a certain cable length: 0 PHY errors in several hours. But in about 1 case out of 3...10 power-ups/enumerations it starts to give errors immediately (and communication breaks in less than an hour). So it's a bit random and depends on enumeration (SuperSpeed training?).

As the USB error rate clearly correlates with the FX3 outputting data on the GPIF bus, and I have not found any external component(s) that could cause this, I have to conclude that this happens inside the FX3 (could it be that the GPIF interface and the USB or clock signals use the same ground wire inside the chip? Or crosstalk inside the FX3?).

I have played only with the CYUSB3014-BZXI chip. Could it be that some other FX3 chip is less sensitive?

About your 80MB/s data rate: this is achievable with an 8-bit GPIF bus between the FPGA and FX3. Compared to a 32-bit bus, this should create much less noise.
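
The bus-width argument can be sketched as a small calculation. It assumes one bus word per clock at a 100 MHz interface clock and ignores any protocol overhead; the helper names and the list of candidate widths (the ones mentioned in this thread) are illustrative assumptions, not FX3 API.

```python
def gpif_peak_mb_per_s(bus_width_bits, pclk_mhz=100.0):
    """Peak throughput in MB/s, assuming one word per clock and no overhead."""
    return bus_width_bits / 8 * pclk_mhz

def narrowest_bus(required_mb_per_s, pclk_mhz=100.0, widths=(8, 16, 32)):
    """Return the narrowest listed bus width that meets the required rate."""
    for width in widths:
        if gpif_peak_mb_per_s(width, pclk_mhz) >= required_mb_per_s:
            return width
    return None  # no listed width is fast enough

if __name__ == "__main__":
    print(gpif_peak_mb_per_s(16))  # 16-bit bus at 100 MHz
    print(narrowest_bus(80))       # width needed for an 80 MB/s stream
```

Under these assumptions an 8-bit bus peaks at 100 MB/s, which leaves some margin over 80 MB/s while toggling a quarter of the pins of a 32-bit bus.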

   

br,

   

kalev

0 Likes
Anonymous
Not applicable

Hi all,

   

The posts above in this thread have focused mainly on FX3 noise sensitivity and how to reduce the BER (Bit Error Rate).
However, I think the main problem is that SuperSpeed communication is not reliable, and in my understanding it breaks too easily.
Let's recall once again two tech support statements that I already presented in my first post:
1) IO toggling results in (FX3) substrate noise (so this may be the reason for PHY errors);
2) the layers of the communication protocol (USB) are designed to recover from such dynamic errors;

The main question should be: why does the USB protocol, in practice, not recover?

Unfortunately I have no USB analyzer. Has anybody explored what actually happens when USB communication breaks? What's to blame? Host? Host chipset or driver? FX3? USB standard? Nobody?

Thanks,

   

kalev

0 Likes
Anonymous
Not applicable

Hi, in my latest tests I found something interesting. My board is very unstable. After many, many tests, I think I may have found my problem. When I solder two chokes in parallel, the transmission lasts longer; however, when I solder them in series, the board goes down every time a transmission starts. So I just threw out all the chokes, and to my surprise the board now started to work! There are still a few PHY/LINK errors, and they sometimes cause the OUT endpoint to malfunction (this happens mainly during simultaneous data transfer on different endpoints, which looks like a crosstalk issue in the PHY IC design), but a RESET from software is able to bring the dead endpoint back to life.

The choke is 600R at 100MHz, as officially recommended. However, I read in Cypress's official documentation that the core current is 800mA at maximum, and the power supply should be able to provide that current. The choke is very thin, so it may not be able to pass enough current. So during transmission the voltage drops heavily and the PHY goes down. I did not use an oscilloscope to check the actual core voltage. I'll check it next time. I think it may be the cause.
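
The voltage-drop hypothesis can be illustrated with Ohm's law. Note that the 600R figure is the impedance at 100 MHz, not the DC resistance (DCR); the 0.3 ohm DCR used below is a purely hypothetical value, since the post gives none, and the real number has to come from the choke's datasheet or a measurement.

```python
def core_voltage(supply_v, load_a, dcr_ohm):
    """Voltage remaining at the core after the DC drop across the choke(s)."""
    return supply_v - load_a * dcr_ohm

def series_dcr(dcr_ohm, count=2):
    """Total DCR of identical chokes connected in series."""
    return dcr_ohm * count

def parallel_dcr(dcr_ohm, count=2):
    """Total DCR of identical chokes connected in parallel."""
    return dcr_ohm / count

if __name__ == "__main__":
    SUPPLY_V = 1.2   # core rail, from the post
    LOAD_A = 0.8     # maximum core current, from the post
    DCR = 0.3        # HYPOTHETICAL choke DC resistance in ohms
    for label, dcr in (("single", DCR),
                       ("series", series_dcr(DCR)),
                       ("parallel", parallel_dcr(DCR))):
        print(label, round(core_voltage(SUPPLY_V, LOAD_A, dcr), 3))
```

With any fixed DCR, two chokes in series double the drop and two in parallel halve it, which would at least be consistent with the observation that the parallel arrangement lasted longer and the series one failed immediately.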

   

The final configuration of my board is a 1.2V LDO and a 1.8V LDO, and I don't think a buck converter will work properly without chokes. I'll run several tests on the buck configuration. Although the device disappearing is largely eliminated, I still have to face the real-world problem of the OUT endpoint hanging. For now I will detect it and reset the OUT endpoint when sending a command fails. Maybe Cypress can give us a better solution.

Anonymous

According to my latest tests, the FX3 only works with 1.8V VIO at 100MHz PPORT; reducing the clock to 75MHz makes it very unstable. Powering VIO from a buck converter makes the device disappear very quickly. Raising VIO to 2.5V soon puts the FX3 into an unresponsive state.

With 1.2V VCore, 1.8V VIO and 100MHz PPORT, an external 19.2MHz oscillator (sorry, I do not have its tolerance specification), and both LDOs without chokes, the board passes the test for at least 30 minutes and transmits about 550GB of toggling data with very few PHY errors (only a few hundred). LINK errors may occur but are very rare. When tested at 50MHz PPORT, both error counts increase rapidly and the device fails very soon.
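To put those numbers in perspective, here is the arithmetic as a small sketch. The error count of 300 is my stand-in for "a few hundred", the 30 minutes is a lower bound from the post (so the throughput is an upper bound), and a PHY error event is not necessarily a single bit error, so the last figure is only an order-of-magnitude indication.

```python
def throughput_mb_s(total_bytes, seconds):
    """Average throughput in MB/s (1 MB = 1e6 bytes)."""
    return total_bytes / seconds / 1e6


def errors_per_bit(error_count, total_bytes):
    """Error events per transferred payload bit."""
    return error_count / (total_bytes * 8)


TOTAL_BYTES = 550e9       # about 550GB, from the post
DURATION_S = 30 * 60      # "at least 30 minutes", from the post
ASSUMED_PHY_ERRORS = 300  # assumption standing in for "a few hundred"

print(f"throughput: {throughput_mb_s(TOTAL_BYTES, DURATION_S):.1f} MB/s")
print(f"errors per GB: {ASSUMED_PHY_ERRORS / (TOTAL_BYTES / 1e9):.2f}")
print(f"errors per bit: {errors_per_bit(ASSUMED_PHY_ERRORS, TOTAL_BYTES):.2e}")
```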

   

So the final (and temporary) solution is to change all current boards to 2 LDOs with no chokes, and to run PPORT at 100MHz. I have found that the FX3 tends to be unstable if PPORT is not running at 100MHz; it seems that the internal bus inside the FX3 runs at 100MHz. With VIO at 1.8V, the FPGA is OK working at 2.5V. Cypress should give us a minimum requirement for the chokes.

Anonymous

I have the same problem. When I download my firmware and feed external data into the FX3 GPIF, the FX3 cannot re-enumerate reliably. It disappears and reappears in CYCONTROL endlessly. But it re-enumerates reliably when the external data is 0x00 or 0xFF, so I think the reason is the GPIF noise you described. When I use a DC-DC power source for 1.2V, the result is worse than with an LDO. My USB cable is 1m or 1.5m. I think the quality of the 1.2V supply has an impact on GPIF noise. I have no good idea how to improve the stability of my own board.
Anonymous

I think everyone with this problem should compare SDK versions as well. I have a couple of boards that work fine with the 1.2.3 SDK but, with 1.3.1, either fail with PHY errors or can't enumerate at all. I don't know why the SDK upgrade makes a difference: the boards work great at full USB3 speed with the 1.2.3 SDK, but with 1.3.1 they only work on USB2. There is likely some other hardware factor aside from the SDK version, but something in the SDK seems influential.

Anonymous

Hi,

I will describe a very typical situation (about 20% of cases) that I see when the communication breaks in my noise test.

The host's read from the IN pipe fails with error code USBD_STATUS_XACT_ERROR in the USB driver.
Typically the Control Endpoint (EP0) still continues to function, so I can read, for example, the device string descriptors.
I have noticed that if I read a string, the device does not return string data; instead it returns data that should have been returned on the IN pipe, i.e. data from the GPIF bus. What is more interesting: I specify data length=256 in my read SETUP requests (more than the actual string length), and the device always returns the correct amount of data, matching exactly the length of the string I requested! It is as if the FX3 sets all fields (endpoint number and data length) correctly in the data packet header, but somehow the payload in the packet is wrong. If I keep reading strings (about 100...400 times), I finally start to get data that looks very much like strings: not 100% correct, but at least it is clear that it is not from the GPIF bus.
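A host-side test could flag this condition automatically by checking whether the bytes returned on EP0 are a well-formed USB string descriptor at all. A minimal sketch: the layout used (bLength in byte 0, bDescriptorType 0x03 in byte 1, UTF-16LE text after that) is the standard string descriptor format, while the function name and the sample data are made up for illustration.

```python
STRING_DESCRIPTOR_TYPE = 0x03  # bDescriptorType for string descriptors


def parse_string_descriptor(raw):
    """Return the decoded text, or raise ValueError if raw is malformed."""
    if len(raw) < 2:
        raise ValueError("too short for a descriptor header")
    b_length, b_type = raw[0], raw[1]
    if b_type != STRING_DESCRIPTOR_TYPE:
        raise ValueError(f"bDescriptorType is 0x{b_type:02X}, expected 0x03")
    if b_length != len(raw) or b_length % 2:
        raise ValueError("bLength does not match the returned data")
    # UnicodeDecodeError is a subclass of ValueError, so bad text
    # is reported the same way as a bad header.
    return bytes(raw[2:]).decode("utf-16-le")


good = bytes([8, STRING_DESCRIPTOR_TYPE]) + "FX3".encode("utf-16-le")
print(parse_string_descriptor(good))

try:
    # Right length, but the content looks like streamed payload data.
    parse_string_descriptor(bytes(range(8)))
except ValueError as err:
    print("rejected:", err)
```

Logging how many consecutive reads are rejected before valid strings come back would give a number to compare across hosts and SDK versions.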

As I have no USB analyzer, I can't state that it is the FX3 that starts to send incorrect data. Theoretically this could also be a bug in the host chipset or driver. But clearly, somewhere there are serious problems in recovering from USB protocol errors.

br,
kalev
 

Anonymous

Hi, I read one of your posts about the LNK_PHY_TX_TRIM register and checked that register, only to find that both SDK 1.2.3 and 1.3.1 set it to the same value: 0x0B569011. I dumped all of the USB3 Link Controller registers (see the FX3 TRM for reference); the data is in the attachment.

Apparently there are differences between the two SDK versions; I counted a total of 8 registers. We can also see that Cypress hides several register definitions. I'll try writing these register values into the registers to see if 1.3.1 performs better.

1.2.3 is much more stable than 1.3.1; when using crystals, 1.2.3 won't cause the device to fail, but 1.3.1 will easily break down.

SaBh_294166

Hi,

   

Thank you Kalev and everyone else for this extremely interesting thread. We too are interested in using the FX3 with a 32-bit bus for some projects, but are concerned about the same issues. We will review this thread in detail again, but here are some questions / comments:

1) Any attempts to work around the cable length issue using USB3.0 redrivers? There are a few solutions on the market, so it may be worth testing.

2) Any attempts to use a USB3.0 active cable? (It would feature similar redrivers internal to the cable assembly.)

3) Or perhaps an optical cable extension (however, this is much more costly)? We are very curious to know whether any of these workarounds eliminate this issue.

We too cannot be sure of the quality of the host USB3.0 adapter, so this issue is of major concern. Nevertheless, we will test and review the sample code (by Kalev) to check the results.

Also, as of today, we received the Nuvation USB3.0 kit, which uses the FX3 on one side plus a BeStick (Altera FPGA) on the USB stick to send/receive data between 2 PCs. We are not sure of the format used to transfer data, but considering the toolkit is intended for benchmarking, it is likely a 32-bit bus. We will check whether the tools report any of the same errors. We have a lot of learning to do but will read up, and will chime back in as we start testing. We also had a brief chat with TotalPhase about their USB3.0 bus analyzer. As it is a major expense, we would rather not jump to such a tool, but will consider it if it assists our development and debugging.

   

References for the FX3 kit by Nuvation:

http://www.cypress.com/?rID=64310
http://www.cypress.com/?rID=64819
http://www.cypress.com/fx3/

   

wye1102,

you noted a high-current choke in your post above. Check out max echo in Taipei. We have met them a number of times and have used their parts for years. Excellent costs and quality.

http://www.maxecho.com.tw/eng/content2.htm

Regards

Kumar

Anonymous

Hi all, Cypress recently released a new SDK, version 1.3.3, which seems to have solved a couple of the issues discussed above. I have tested it with the same data pattern (full toggling through the slave FIFO, about 4TB of data transferred at around 340MB/s) and noticed no new errors during the transmission. So the stable SDK version can safely be upgraded from 1.2.3 to 1.3.3. As far as I can infer, version 1.3.1 was just a beta version that added some features, such as LPMDisable support, which may cause extremely poor performance when connected to an Intel host.

Cypress has revealed the possible reason for device re-enumeration at high data rates: the firmware has an error limit of around 60 per second, which may be a problem in an environment with poor signal integrity. I have verified this. Our product uses a 4-wire shielded cable to connect the PCB to another connector. At first I used a simple scheme to solder the PCB side. The observed error rate reached around 50 per second, and the connection easily dropped offline and re-enumerated. I revised the connection scheme, and the average error rate dropped to around 1~2 per second at most. The device never goes offline any more. These error rates were measured using SDK 1.3.3. SDK 1.2.3 has no error limit and its error counter is hidden (we can't get the actual error count), so it can still function in a poor signal-integrity environment. The limit of 60 errors per second is believed to have been added in 1.3.1.
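For anyone who wants to watch how close a board runs to that limit, the rate can be derived from two readings of the error counter. A sketch under assumptions: the counter is taken to be a cumulative 16-bit value that wraps around (an assumption, check how your firmware exposes it), the 50% warning margin is arbitrary, and only the limit of about 60 errors per second comes from the post.

```python
ERROR_LIMIT_PER_S = 60  # approximate firmware limit quoted in the post


def error_rate(prev_count, curr_count, interval_s, counter_bits=16):
    """Errors per second between two counter readings, allowing one wrap."""
    delta = (curr_count - prev_count) % (1 << counter_bits)
    return delta / interval_s


def near_limit(rate, margin=0.5):
    """True if the rate is at or above the given fraction of the limit."""
    return rate >= ERROR_LIMIT_PER_S * margin


# Readings one second apart, roughly matching the two cable
# solderings described in the post (about 50/s and about 2/s).
for prev, curr in ((100, 150), (150, 152)):
    rate = error_rate(prev, curr, 1.0)
    print(f"{rate:.0f} errors/s, near limit: {near_limit(rate)}")
```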

   

 

   

Btw, the test environment is: Intel USB 3.0 host controller, VL812 USB 3.0 hub, 3-meter USB cable. The first (pattern) test used an on-board connector with a 19.2MHz oscillator, and the second (signal integrity) test used an off-board connector with a 19.2MHz crystal. I'll run a test with the oscillator and the off-board connector to verify whether the oscillator is really better than the crystal. However, it seems CVDDQ has something to do with the crystal.

Anonymous

Hi everybody,

My test result is different.

The test board is CYUSB3KIT-001, and I used the FX3USBnoise3 files.

Results with the precompiled image (FX3GPIFnoise.img):

Read Only Test - OK
Write Only Test - Fail, Write to device failed (GetOverlappedResult error code=1167)
Both Test - Fail, Control pipe DeviceIoControl failed (GetOverlappedResult error code=31)
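For readers decoding these results: the two numbers are standard Win32 system error codes. A small lookup sketch (the table holds only the two codes seen here; the helper name is made up):

```python
# Win32 system error codes that appear in the test output above.
WIN32_ERRORS = {
    31: ("ERROR_GEN_FAILURE",
         "A device attached to the system is not functioning."),
    1167: ("ERROR_DEVICE_NOT_CONNECTED",
           "The device is not connected."),
}


def describe(code):
    """Return a readable line for a Win32 error code, if it is in the table."""
    name, text = WIN32_ERRORS.get(code, ("UNKNOWN", "not in this table"))
    return f"{code}: {name} ({text})"


for code in (1167, 31):
    print(describe(code))
```

Both codes are consistent with the device dropping off the bus during the test, rather than with a bug in the application's transfer logic, although they do not prove it.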

   

 

   

Question #1:

When I use the C++ application (FX3USBtest), the Write Only test fails, but the managed C++ application (Cypress C++ Streamer) operates normally. I wonder why that is, and whether other people see the same.

Results with the image I built myself:

I built it with SDK 1.3.3, following the report above that this version improves things. I can't update the library source code; I just built the FX3device program.

Read Only Test - OK
Write Only Test - OK; the PHY error count increases, but the operation keeps working.
Both Test - I get no error message, but the read/write speed is zero.

             FX3USBtest version. 1.3
             Device is in USB3 Super Speed mode
             Device internal error counters at program start: PHY/LNK=4
             Program treats these values as initial zero point to conti

             Press ESC for exit
             ^C:00:04  Read/Write=  0.0/  0.0MB/s  Errors PHY/LNK=2/0

I am very confused. I really wonder whether the problem has been resolved in the S/W framework.

I would like to hear your comments on this test result.

Best regards,

Jade, Lee

Anonymous

Hi all,

   

let me post a final solution that may work for your systems. We have now worked out several different designs with the FX3 and an FPGA. Our tests confirm that the FX3 has noise issues; my guess is that Cypress did not register the slave FIFO control signals, since doing so would increase read latency by one extra cycle. The FX3 is also confirmed to have some problems in the firmware, but they are almost all fixed in the new SDK release (1.3.3). So the first step is to update the SDK.

In one of our designs, we noticed severe signal noise on all IO pins. With VCCO at 1.8V, the observed noise reaches 0.6 to 3.0V, which causes the FX3 slave FIFO state machine to die. This design uses a cheap connector with dual-row SMT pins. We solved this problem completely by reducing the drive strength to 1/4 of full strength (3/4 is the default). This design has separate digital and analog power supplies, and the analog supply produces clean power with ripple below 5mV Vp-p. The observed transmission error count is 0 over TBs of data transfer.

In another design, the USB link fails if the slave FIFO toggles between 0x00000000 and 0xFFFFFFFF. This issue is almost solved by switching to SDK 1.3.3; however, the design is sometimes still not quite stable. It uses a single LDO for all 1.2V rails and shows several bit errors after some GBs of data transfer.

In another design, the slave FIFO also dies, but less frequently. This design puts the FPGA and the FX3 together on a single 12-layer board, which may help reduce noise. It also uses split power supplies and likewise shows 0 transmission errors over TBs of data transfer. Further tests with 1/4 drive strength showed a great improvement in slave FIFO stability.

In another design, the bypass capacitor is not placed at the optimal position. This design is only able to run at 66MHz. If set to 100MHz, it dies within seconds. This design uses a single supply for all 1.2V rails.

   

In conclusion, if you want to improve the FX3's performance and stability:

1) Split the power supplies. For the FX3, 4 separate supplies may be optimal (1.2V VDD, 1.2V VCC, 1.8V VDD, and 1.8V CVDDQ). Today's high-frequency DC-DC converters (above 1.2MHz) can easily keep output ripple within 5mV Vp-p.
2) Reduce the IO voltage. 1.8V is the best of all.
3) Reduce the IO drive strength to 1/4. This helps reduce the noise introduced by the IO and the transmission line.
4) For some cross-board designs, a USB 3.0 redriver may be optimal to maintain signal integrity. The redriver should be placed as close to the receptacle as possible.

After some TBs of data transfer, we have confirmed 1) to 3); 4) is still being verified, since we have only used the redriver on tiny designs.

0 Likes

Thanks for all this information. It will greatly help on our new design. 

   

One question: for all these designs, did you use the 22 ohm series resistors? I'm guessing they help limit current spikes and ground bounce.

   

Thanks!

   

WBB
