We've observed a speed of about 210 MB/s in the streamer application with a P67 chipset and an NEC controller.
This is too slow compared to the SuperSpeed 5 Gbps rate, and even to the expected performance of the FX3.
I wonder which of the system components is the bottleneck at this low speed (south bridge, PCIe, NEC controller, FX3's memory architecture, DMA speed, host driver, firmware driver, ...),
and whether there is any possibility of a speed-up in the official release version of the driver and firmware.
In my application, speed is a very important factor, so I'd like to hear about speed-up techniques and the limit of the FX3, if one exists.
Currently Windows does not natively support USB3, and we don't have a clear picture of host controller performance. Once the host controller products mature and Windows starts supporting USB3 natively, we should be able to provide performance numbers. At this point, 220 MB/s was seen during one of our tests, but we've not looked at how much more can be achieved.
A lot of mainboards equipped with a USB3 controller have only one PCI-e 1.x lane connected to the USB3 controller.
This means a maximum of about 250 MB/s for the PCI-e lane.
Some mainboards have an option in the BIOS to assign two PCI-e 1.x lanes to the USB3 controller by disabling the SATA600 controller. But that's only an option when you have a separate SATA600 controller, which a mainboard with the P67 probably does not have. (SATA600 is already integrated in the P67 chipset.)
You need at least one PCI-e 2.x lane or two PCI-e 1.x lanes to the USB3 controller to achieve > 250 MB/s bandwidth. (theoretical 500 MB/s)
If I'm correct, USB3 supports up to 3 simultaneous transfers with 128 MB/s guaranteed bandwidth, probably comparable to isochronous or interrupt transfers in USB2. Following this assumption, you could try to use 3 endpoints to transfer as much data as possible.
But then again, I don't have any experience with USB3.0 yet.
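The PCI-e lane figures quoted above (about 250 MB/s for one 1.x lane, a theoretical 500 MB/s for one 2.x lane or two 1.x lanes) can be reproduced with a few lines of arithmetic. This is only a rough sketch: the function name is made up for illustration, and it assumes 8b/10b line encoding and ignores packet and protocol overhead, so real throughput will be lower.

```python
def line_rate_mb_per_s(gigatransfers_per_s: float, lanes: int = 1) -> float:
    """Payload ceiling in MB/s for an 8b/10b-encoded serial link.

    Assumption: only the 8b/10b encoding cost is counted; packet headers
    and flow control are ignored.
    """
    raw_bits_per_s = gigatransfers_per_s * 1e9 * lanes
    payload_bits_per_s = raw_bits_per_s * 8 / 10  # 8 data bits per 10 line bits
    return payload_bits_per_s / 8 / 1e6           # bits -> bytes -> MB


pcie1_x1 = line_rate_mb_per_s(2.5)      # one PCI-e 1.x lane (2.5 GT/s)
pcie1_x2 = line_rate_mb_per_s(2.5, 2)   # two PCI-e 1.x lanes
pcie2_x1 = line_rate_mb_per_s(5.0)      # one PCI-e 2.x lane (5 GT/s)

print(f"PCI-e 1.x x1: {pcie1_x1:.0f} MB/s")
print(f"PCI-e 1.x x2: {pcie1_x2:.0f} MB/s")
print(f"PCI-e 2.x x1: {pcie2_x1:.0f} MB/s")
```

The same function with 5.0 GT/s and one lane also gives the raw USB3 ceiling, since SuperSpeed uses 8b/10b at 5 Gbps as well.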
We've also made speed tests with the FX3 eval kit.
On a PC with an NEC XHCI card in a PCIe 2.0 slot we don't really get more than roughly 220 MB/s under optimal conditions. On PCIe 1.0 it is even less...
I tried everything I could (large host buffers, many host buffers, bursting) but couldn't get more speed.
I can't really believe that the developers at Cypress would do a design like the FX3 and have no idea about its performance.
It should be quite simple to find out where the bottleneck really is.
For example, one could use 2 FX3 boards on one host adapter and measure the accumulated bandwidth.
I have only one, so I can't do that...
For me speed is an important factor too...
Now I tested the FX3 board in sync slave FIFO mode (80 MHz PCLK) on a Gigabyte GA-790FXTA-UD5 mainboard. The speed is nearly 300 MB/s during BULK IN streaming with 16 packets per transfer. See screenshot. I think the Renesas host controller is connected via PCIe 2.0 on this board. On my office HP Compaq 6000 Pro PC it seems to be only a PCIe 1.0 connection.
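As a rough cross-check of the 80 MHz PCLK figure, the peak rate of the slave FIFO interface can be estimated from the clock and the data bus width. The bus width is not stated in the post, so the widths below are assumptions, and the helper name is made up for illustration. The sketch also assumes one word is written on every clock cycle with no stalls.

```python
def slave_fifo_ceiling_mb_per_s(pclk_mhz: float, bus_width_bits: int) -> float:
    """Peak interface rate in MB/s if one word is clocked in on every PCLK cycle."""
    return pclk_mhz * bus_width_bits / 8


# Bus widths are assumed values; the post only gives the 80 MHz PCLK.
for width in (8, 16, 32):
    ceiling = slave_fifo_ceiling_mb_per_s(80, width)
    print(f"{width:2d}-bit bus at 80 MHz: {ceiling:.0f} MB/s")
```

Under these assumptions a 32-bit bus would top out at 320 MB/s, so a measured rate of nearly 300 MB/s would be close to the interface limit rather than a host-side limit.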
I think the 512 MB/s is an error.
Simply put, the USB3.0 connection operates at 5 Gbps and with 8b/10b encoding you can transfer at a maximum of 500 MBytes/s.
So perhaps the timer used in the streamer test application is not that stable, or the window used to compute the transfer rate is too short, but I think it is safe to conclude that any measurement above 500 MB/s is too optimistic.
According to the USB3 standard, it should be possible to expect a transfer rate of up to 400 MB/s, so I wouldn't make design decisions that need transfer rates above 400 MB/s.
The only issue that remains is that your USB3 host controller should be fast enough.
@zhangsnail what USB3 host controller (and what driver version) are you using, since you seem to be using all of the theoretically available bandwidth?
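The concern about an unstable timer or a too-short measurement window can be addressed by averaging over a longer period with a monotonic clock, then rejecting any result above the 500 MB/s ceiling. The following is a minimal sketch, not the streamer application's actual code: `read_chunk` is a hypothetical callable that performs one transfer and returns the number of bytes received, and the 2 second window is an arbitrary choice.

```python
import time
from typing import Callable

USB3_CEILING_MB_S = 500.0  # 5 Gbps with 8b/10b encoding, as stated above


def measure_mb_per_s(
    read_chunk: Callable[[], int],
    min_seconds: float = 2.0,
    clock: Callable[[], float] = time.perf_counter,
) -> float:
    """Average throughput in MB/s over at least `min_seconds`.

    `read_chunk` is a hypothetical stand-in for one bulk transfer; it must
    return the number of bytes it moved.
    """
    start = clock()
    total_bytes = 0
    while True:
        total_bytes += read_chunk()
        elapsed = clock() - start
        if elapsed >= min_seconds:
            return total_bytes / elapsed / 1e6


def is_plausible(rate_mb_per_s: float) -> bool:
    """A rate above the raw link ceiling points to a measurement error."""
    return 0.0 < rate_mb_per_s <= USB3_CEILING_MB_S
```

With a check like this, a reading of 512 MB/s would be flagged as a measurement problem instead of being reported as a result.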
I've tested the Cypress FX3 with several USB controllers in BULK mode (bulk loop); these are the best results I have had so far.
Controller ASRock USB 3.0 (Renesas):
Programming Failed (in USB Control Center)
Controller PEU3P4 (generic VIA):
Bulk out: near 90 MB/s
Bulk in: near 160 MB/s
Controller RocketU 1144A (ASM):
Bulk out: 244 MB/s
Bulk in: 261 MB/s
I got these rates with my own test program (better than BulkLoop.exe).
Chris R: How did you use BULK STREAM?
Can you give me the parameters that you used during this test?
The computer sends 256 KB packets.
The same settings here. I use a fixed sync slave FIFO connection generating only IN traffic. The FIFO flags are not checked and SLWR# is always asserted. 80 MHz PCLK. I get the same results using the WinUSB driver (transferring 1 GB or 2 GB in one chunk, so there is no need to adjust the transfer settings). Next week I will get an adapter PCB in order to connect the FX3 board to our FPGA-based hardware. I will report the measurement results. I think I can get the FX3 working like the FX2, so hopefully no changes will have to be made in our WinUSB-based software.