10 Replies Latest reply on Dec 4, 2019 9:40 PM by YashwantK_46

    CYUSB3013 low control read performance with FX3 SDK library versions 1.3.2 and higher

    AlSh_4533926

      Recent versions of the FX3 SDK firmware library (1.3.2 and higher) have a problem with CYUSB3013: abnormally slow processing of standard EP0 control read requests (reproduced with the "cyfxbulklpautoenum" example from the FX3 SDK).

      With library 1.3.1, CYUSB3013 completes about 8000 EP0 control read operations per second over USB 2.0; upgrading to 1.3.2 (or higher) reduces this to about 100 operations per second.

      The problem is specific to CYUSB3013, and it appears after the CyU3PDmaChannelSetXfer() function is called.

      The problem does not happen with CYUSB3014. Also it does not happen with EP0 control write operations.

      We badly need a bug fix in FX3 SDK library 1.3.4 (and higher) for CYUSB3013!

      Downgrading to an older version of the library is not an option.

       

      More details on reproducing the problem:

       

      1) Use the cyfxbulklpautoenum firmware example from the SDK code examples at https://www.cypress.com/documentation/code-examples/usb-superspeed-code-examples

      2) Make sure the firmware calls the CyU3PDmaChannelSetXfer() function

      3) Use a host computer to issue EP0 “control read” operations over USB 2.0 (high speed) with the following parameters:

       

      GET_CONFIGURATION

      bmRequestType = 0x80

      bRequest = 8

      Value = 0

      Index = 0

      Length = 1

       

      4) Measure performance (operations per second)
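
The request parameters in step 3 map onto the standard 8-byte USB setup packet. Below is a minimal C sketch of that layout, assuming the standard little-endian encoding of the 16-bit fields. The names `setup_packet`, `pack_setup` and `get_configuration_request` are illustrative only and are not part of the FX3 SDK or libusb.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper type: the five fields of a USB control setup packet. */
struct setup_packet {
    uint8_t  bmRequestType;
    uint8_t  bRequest;
    uint16_t wValue;
    uint16_t wIndex;
    uint16_t wLength;
};

/* Serialize the fields into the 8-byte wire format.
 * The 16-bit fields are sent low byte first. */
void pack_setup(const struct setup_packet *p, uint8_t out[8])
{
    assert(p != NULL && out != NULL);
    out[0] = p->bmRequestType;
    out[1] = p->bRequest;
    out[2] = (uint8_t)(p->wValue & 0xFF);
    out[3] = (uint8_t)(p->wValue >> 8);
    out[4] = (uint8_t)(p->wIndex & 0xFF);
    out[5] = (uint8_t)(p->wIndex >> 8);
    out[6] = (uint8_t)(p->wLength & 0xFF);
    out[7] = (uint8_t)(p->wLength >> 8);
}

/* GET_CONFIGURATION with the parameters listed in step 3:
 * device-to-host standard request, bRequest 8, one byte of data. */
struct setup_packet get_configuration_request(void)
{
    struct setup_packet p = { 0x80, 8, 0, 0, 1 };
    return p;
}
```

With libusb-1.0, the same five values would normally be passed as arguments to libusb_control_transfer() rather than packed by hand.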

        • 1. Re: CYUSB3013 low control read performance with FX3 SDK library versions 1.3.2 and higher
          YashwantK_46

          Hi,

           

          Can you please tell me whether you are performing data transfers over the bulk and control endpoints in parallel?

          Also, is the host application requesting data continuously, or does it issue a new request only after it has received the data from the previous request?

           

          Is it possible for you to share the host application and the firmware that you are currently using?

          It would help us in understanding the behaviour better and enable us to get to the probable cause.


          Regards,
          Yashwant

          • 2. Re: CYUSB3013 low control read performance with FX3 SDK library versions 1.3.2 and higher
            AlSh_4533926

            Hi Yashwant,

             

            Here are more details:

             

            1) The host computer did not use bulk data transfers in these tests, only the GET_CONFIGURATION control operation.

             

            2) The host application waits for each control transaction to complete before initiating the next one. With the old library version this is not a problem: it achieves 8000 control transactions per second (likely one transaction per 125 µs USB microframe). With the new library versions, overall performance drops to 100 transactions per second.

             

            3) I am now preparing a very simple host program and simple FX3 code to demonstrate the problem in as much isolation as possible. It will take some time. Is it OK if the host code is for Linux? Using Linux isolates the test program from Windows driver behavior.
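
The arithmetic behind the 125 µs estimate in point 2 can be checked with a short C sketch. The function names are illustrative. The 125 µs period is the USB 2.0 high-speed microframe length; that the host schedules exactly one control transfer per microframe is an assumption, not something measured here.

```c
#include <assert.h>

/* USB 2.0 high-speed microframe period in microseconds. */
#define MICROFRAME_USEC 125.0

/* Average time available per transfer at a given transfer rate. */
double usec_per_transfer(double transfers_per_second)
{
    assert(transfers_per_second > 0.0);
    return 1e6 / transfers_per_second;
}

/* The same budget expressed in microframes. */
double microframes_per_transfer(double transfers_per_second)
{
    return usec_per_transfer(transfers_per_second) / MICROFRAME_USEC;
}
```

At 8000 transfers per second each transfer takes 125 µs, which is one microframe. At 100 transfers per second each transfer takes 10000 µs, which is 80 microframes.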

             

            Thank you,

              Alex

            • 3. Re: CYUSB3013 low control read performance with FX3 SDK library versions 1.3.2 and higher
              YashwantK_46

              Hi Alex,

               

              I am trying to reproduce and test the issue on my side.


              If you are done with the host program, can you please share it along with the firmware so that I can better test the issue?


              Regards,
              Yashwant

              • 4. Re: CYUSB3013 low control read performance with FX3 SDK library versions 1.3.2 and higher
                AlSh_4533926

                Hi Yashwant, please find below the steps to reproduce the problem on Linux.

                 

                Thank you,

                   Alex

                 

                1       FX3 firmware

                Use the following firmware image provided by Cypress in EZ-USB FX3 SDK 1.3.4 for Linux

                1. Download FX3_SDK_1.3.4_Linux.tar.gz from  https://www.cypress.com/file/424271/download
                2. Extract cyusb_linux_1.0.5.tar.gz from FX3_SDK_1.3.4_Linux.tar.gz
                3. Use fx3_images/cyfxbulklpautoenum.img

                 

                2       Host computer software

                1. A Linux computer is used as the host.
                2. Requirements: libusb-1.0 and sudo
                3. Download fx3bug_v2.tgz via the following link:

                https://www.dropbox.com/sh/wgm1wl9wemde4le/AADdu8f_n9Vb_v1iZsAuh7CEa?dl=0

                 

                4. Extract the following files from fx3bug_v2.tgz:

                File                     Description
                controltest.c            Test program to demonstrate the problem
                cyfxbulklpautoenum.img   Firmware image provided by Cypress in EZ-USB FX3 SDK 1.3.4 (downloaded as described above)
                loadfx3.c                Simple software tool to load a firmware image into the FX3 device (in our case, CYUSB3013)
                Makefile                 Builds controltest and loadfx3 from sources
                run_tests.sh             Bash script to build and run the test

                 

                5. Run the script: “run_tests.sh”

                3       Notes on test script

                Test script “run_tests.sh” does the following:

                1. Use “make” to build “controltest” and “loadfx3”
                2. Check functionality of “bare” CYUSB3013 (with no firmware loaded, just after power-cycle)

                a) Measure performance for control write: SetConfiguration

                b) Measure performance for control read: GetConfiguration

                 

                3. Load cyfxbulklpautoenum.img into FX3 device being tested (in our case - CYUSB3013)

                4. Check functionality of “loaded” CYUSB3013

                 

                4       Test results

                Test script was run with CYUSB3013 FX3 device, connected via USB2, with the following results:

                • The “bare” FX3 device (with no firmware loaded) performs more than 23000 control read/write operations per second
                • The FX3 device with SDK 1.3.4 firmware loaded performs fewer than 100 control read operations per second

                 

                $ ./run_tests.sh

                Running tests. Cycle the board power, please. Press <ENTER>

                 

                cc   -lusb-1.0  loadfx3.c -o loadfx3

                cc   -lusb-1.0  controltest.c   -o controltest

                Bare CPU, control write: SetConfiguration

                10000 operations in 0.416679 seconds: 23999.3 operations per second

                Bare CPU, control read: GetConfiguration

                10000 operations in 0.424478 seconds: 23558.3 operations per second

                 

                Load FX3 CPU with Cypress basic_examples/cyfxbulklpautoenum

                 

                Loaded CPU, control write: SetConfiguration

                10000 operations in 0.48143 seconds: 20771.5 operations per second

                Loaded CPU, control read: GetConfiguration. Wait.

                10000 operations in 113.544 seconds: 88.0715 operations per second
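
The figures in the log follow directly from the operation count and the elapsed time. Below is a small C sketch of that calculation, plus the slowdown ratio between two runs. The function names are illustrative and are not taken from controltest.c.

```c
#include <assert.h>

/* Throughput from an operation count and the elapsed wall-clock time. */
double ops_per_second(unsigned long operations, double elapsed_seconds)
{
    assert(elapsed_seconds > 0.0);
    return (double)operations / elapsed_seconds;
}

/* How many times slower one run is than another. */
double slowdown_factor(double fast_ops_per_second, double slow_ops_per_second)
{
    assert(slow_ops_per_second > 0.0);
    return fast_ops_per_second / slow_ops_per_second;
}
```

For the loaded control read above, ops_per_second(10000, 113.544) gives about 88 operations per second. Comparing that with the loaded control write rate of 20771.5 operations per second gives a slowdown of roughly 236 times.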

                • 5. Re: CYUSB3013 low control read performance with FX3 SDK library versions 1.3.2 and higher
                  AlSh_4533926

                  Hi Yashwant,

                   

                  Further investigation revealed that the problem is not specific to CYUSB3013 chip.

                  Exactly the same problem (low performance) occurs with the original Cypress FX3 development kit (CYUSB3014 chip) and the "cyfxbulklpautoenum" example taken from FX3 SDK revision 1.3.4.

                  What obscured the situation is that EP0 control read performance depends on the operations previously performed on the FX3.

                  With a simple, straightforward sequence of actions (power cycle, load cyfxbulklpautoenum, initiate GET_CONFIGURATION), the problem manifests itself consistently: only 90 control read operations per second, instead of more than 8000 (up to 25000) operations per second.

                   

                  The updated Linux script and the updated test report document have been uploaded to https://www.dropbox.com/sh/wgm1wl9wemde4le/AADdu8f_n9Vb_v1iZsAuh7CEa?dl=0

                   

                  Regards,

                    Alex

                  • 6. Re: CYUSB3013 low control read performance with FX3 SDK library versions 1.3.2 and higher
                    YashwantK_46

                    Hi Alex,


                    Please refer to the following thread: Re: CYUSB3014 - How to speed up operate control endpoint IN for USB highspeed


                    Also, please refer to FX3_SDK_TroubleShooting_Guide provided with the FX3 SDK, Section 2.3, Part IV.

                    CyU3PUsbSendEP0Data() is slow because the other IN (BULK, ISO, INTERRUPT) endpoints need to be suspended so that the data on the control endpoint doesn't get corrupted by premature data fetching from the DMA channel.

                    In FX3 SDK version 1.3.4, you can read the source of CyU3PUsbSendEP0Data() to see how the IN endpoints are suspended and then resumed once the EP0-IN transfer has finished.

                    SDK version 1.3.1 does not show the slowdown because this fix was included only from SDK version 1.3.2 onwards.

                     

                    Regards,

                    Yashwant

                    • 7. Re: CYUSB3013 low control read performance with FX3 SDK library versions 1.3.2 and higher
                      AlSh_4533926

                      Hi Yashwant,

                       

                      What you describe (the bulk-control interference problem) makes practical use of CYUSB3014/CYUSB3013 almost impossible.

                      When is this silicon bug scheduled to be fixed?

                       

                      Thank you,

                        Alex

                      • 8. Re: CYUSB3013 low control read performance with FX3 SDK library versions 1.3.2 and higher
                        YashwantK_46

                        Hello Alex,

                         

                        As described in the FX3_SDK_TroubleShooting_Guide provided with the FX3 SDK (Section 2.3, Part IV), CyU3PUsbSendEP0Data() is slow because the other IN (BULK, ISO, INTERRUPT) endpoints need to be suspended so that the data on the control endpoint doesn't get corrupted by premature data fetching from the DMA channel. This change was included from SDK version 1.3.2 onwards.

                         

                        As a workaround, you could use the following snippet of code instead of CyU3PUsbSendEP0Data():

                                       extern CyU3PDmaChannel glUibChHandle;   /* IN channel handle for EP0 */

                                       extern CyU3PReturnStatus_t              /* declaration of CyU3PDmaChannelSendData */
                                       CyU3PDmaChannelSendData (
                                                    CyU3PDmaChannel *handle,
                                                    uint8_t         *buffer,
                                                    uint16_t         count);

                        In the vendor command that you are using, replace CyU3PUsbSendEP0Data() with the two lines below:

                                       CyU3PDmaChannelSendData (&glUibChHandle, glEp0Buffer, wLength);   /* use instead of CyU3PUsbSendEP0Data */
                                       CyU3PUsbAckSetup ();                                              /* important: ACK the request from the host */

                         

                        NOTE: The above workaround brings control endpoint performance back to that of SDK 1.3.1, but the firmware must ensure that no BULK-IN transaction takes place while an EP0-IN transaction is in progress; otherwise the control read data will be corrupted.

                        Also, this workaround is provided only to you and will not be added to future SDK releases because of the problem mentioned above. With future releases, you will have to maintain the workaround in your application yourself.

                         

                        Regards,
                        Yashwant

                        • 9. Re: CYUSB3013 low control read performance with FX3 SDK library versions 1.3.2 and higher
                          AlSh_4533926

                          Hi Yashwant,

                           

                          Thank you. Yes, I understand that because of the CYUSB301X silicon bug it is practically impossible to use USB 2.0 control transfers simultaneously with bulk transfers.

                           

                          So, looking for workarounds:

                          1) Is it possible for two bulk IN endpoints to work in parallel over USB 2.0?

                          2) Is a DMA AUTO channel stopped while another DMA channel's callback is running?

                          3) Can one DMA callback be interrupted by another?

                           

                          Thank you,

                            Alex

                          • 10. Re: CYUSB3013 low control read performance with FX3 SDK library versions 1.3.2 and higher
                            YashwantK_46

                            Hi Alex,

                             

                            1.) Can you please specify what you mean by two BULK-IN endpoints working in parallel?
                            How do you plan to achieve this parallel operation?

                             

                            2.) DMA auto channel is not stopped during the DMA callback of another channel. DMA AUTO channel can work independently.

                             

                            3.) There is no concept of prioritizing callbacks in FX3. They are handled on a first-come, first-served basis, so the current callback will be serviced first, and only then will the second callback be serviced.

                            A DMA callback will not be interrupted by another DMA callback.

                             

                            Regards,

                            Yashwant