FX3 SLAVE FIFO OUT transfer stopped


lexu_769621

I am working on a project that needs to transfer data from a PC to an FPGA. I am using the slave FIFO interface.

The configuration is:

16-bit GPIF bus

DMA buffer size = 16 KB

DMA buffer count = 4

DMA mode: AUTO

If the FPGA runs faster than the USB host OUT rate, the system runs very well. When the FPGA runs slower than the USB OUT rate, the transfer only runs for several packets and then blocks.

My guess:

It seems that if the FPGA runs slower than the USB OUT rate, all 4 DMA buffers can fill up. Once all 4 DMA buffers are full, the AUTO OUT process terminates and can NOT recover.

Is this correct? If so, how can I keep the process going instead of terminating when all 4 DMA buffers are full?

Rashi_Vatsa
Moderator

Hello,

If you are using one GPIF socket and one USB socket for the DMA AUTO channel, you can change the DMA channel type to CY_U3P_DMA_TYPE_AUTO_SIGNAL (auto mode with produce event signaling). Then register for PROD and CONS events. Whenever one of these events occurs, the DMA callback is called, so the events can be counted to see how many buffers have been produced (filled) and consumed (emptied). You can print these PROD and CONS counts from the for{} loop in the main thread entry function:

dmaCfg.notification = CY_U3P_DMA_CB_PROD_EVENT | CY_U3P_DMA_CB_CONS_EVENT;
dmaCfg.cb = CyFxBulkLpDmaCallback;

void
CyFxBulkLpDmaCallback (
        CyU3PDmaChannel   *chHandle, /* Handle to the DMA channel. */
        CyU3PDmaCbType_t  type,      /* Callback type.             */
        CyU3PDmaCBInput_t *input)    /* Callback status.           */
{
    if (type == CY_U3P_DMA_CB_PROD_EVENT)
    {
        glDMARxCount++;     /* One more buffer filled by the producer. */
    }
    if (type == CY_U3P_DMA_CB_CONS_EVENT)
    {
        glDMATxCount++;     /* One more buffer drained by the consumer. */
    }
}

    for (;;)
    {
        CyU3PThreadSleep (1000);
        if (glIsApplnActive)
        {
            /* Print the number of buffers produced and consumed so far. */
            CyU3PDebugPrint (6, "Data tracker: RxCount: %d, TxCount: %d\r\n", glDMARxCount, glDMATxCount);
        }
    }

If the buffers are being filled faster than they are consumed by the FPGA, you can try increasing the buffer size. As per the KBA "DMA Buffer Count - KBA218339", the maximum DMA buffer memory that can be used is 224 KB for 512 KB SRAM (CYUSB3014), and a single DMA buffer on FX3 can have a maximum size of 65520 bytes.
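The two limits quoted above can be checked before changing DMA_BUFFER_SIZE or DMA_BUFFER_COUNT. The following is a minimal host-side sketch, not SDK code: the macro and function names are made up for illustration, and it only considers a single channel, so memory used by other channels or by the firmware itself is not accounted for.

```c
#include <stdbool.h>
#include <stdint.h>

/* Limits quoted in the post (KBA218339, CYUSB3014 with 512 KB SRAM). */
#define FX3_MAX_DMA_POOL_BYTES   (224u * 1024u)  /* total DMA buffer memory */
#define FX3_MAX_DMA_BUFFER_BYTES (65520u)        /* largest single buffer   */

/* Hypothetical helper: memory one channel takes, size times count. */
uint32_t dma_channel_bytes(uint32_t buffer_size, uint32_t buffer_count)
{
    return buffer_size * buffer_count;
}

/* Hypothetical helper: true when one channel stays inside both limits. */
bool dma_config_fits(uint32_t buffer_size, uint32_t buffer_count)
{
    if (buffer_size == 0u || buffer_count == 0u) {
        return false;
    }
    if (buffer_size > FX3_MAX_DMA_BUFFER_BYTES) {
        return false;
    }
    /* Divide instead of multiplying so the check cannot overflow. */
    return buffer_count <= FX3_MAX_DMA_POOL_BYTES / buffer_size;
}
```

Under these limits, 16 KB x 4 (64 KB) and 16 KB x 12 (192 KB) both fit, while 32 KB x 8 (256 KB) does not.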

Please let me know whether the DMA flags, i.e. the DMA ready flag and the DMA watermark flag, are used for flow control.

Regards,

Rashi


Hi Rashi,

On the FPGA end, the DMA ready and DMA watermark flags are both used for flow control. My problem is that the USB host OUT side can flow-control the FPGA read side, but the FPGA can't flow-control the USB host.

For example:

If I run the FPGA at a high main clock, the FPGA can consume data faster than the USB host program sends it. The FPGA's actual consumption rate is controlled by the DMA ready / watermark signals: the FPGA works for a while and rests for a while. Under this condition the OUT transfer runs very well.

If I run the FPGA at a lower main clock, the FPGA consumes data more slowly than the USB host program sends it. Although the FPGA is working 100% of the time, it still can NOT keep up with the USB host program. Under this condition, I think the USB host OUT program should be flow-controlled by the FPGA (blocking in waitforxfer()). But in my system, the transfer continues for only several packets and then terminates.

I guess this is because the DMA buffers are all full. Is it a DMA buffer overflow condition? It seems that DMA buffer recycling stops once all 4 DMA buffers are full.

I don't know how to keep the DMA buffer recycling going even when all DMA buffers are full. Is there a setting or callback to handle this?

My project has a dynamic FPGA data consumption rate. The FPGA can "eat" data very fast for a while and very slowly for another while, so I need bidirectional flow control.

BR

rocky


Hello,

From the description, the transfers go from USB to GPIF/FPGA. Is that right?

Can you share the DMA channel configuration settings?

I guess this is because the DMA buffers are all full. Is it a DMA buffer overflow condition? It seems that DMA buffer recycling stops once all 4 DMA buffers are full.

>> This can be confirmed with the suggestion in my previous response, i.e. by tracking PROD and CONS events.

I don't know how to keep the DMA buffer recycling going even when all DMA buffers are full. Is there a setting or callback to handle this?

>> To match the USB speed, the DMA buffer size can be increased, i.e. DMA buffer size = 32 KB and DMA buffer count = 3 or 4, so that it takes longer to fill the DMA buffers and the GPIF/FPGA can consume the filled buffers in the meantime.
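How much time larger buffers buy can be estimated with a back-of-the-envelope calculation: while the host sends faster than the FPGA drains, the pool fills at the difference of the two rates. The sketch below is a host-side illustration only. The function name is hypothetical and the rates in the usage note are assumed values, not measurements from this thread.

```c
#include <stdint.h>

/*
 * Hypothetical helper: microseconds until a pool of buffer_size * buffer_count
 * bytes is full when the host keeps sending at host_bytes_per_sec and the
 * FPGA keeps draining at fpga_bytes_per_sec.
 * Returns UINT64_MAX when the FPGA keeps up, i.e. the pool never fills.
 */
uint64_t usec_until_pool_full(uint32_t buffer_size,
                              uint32_t buffer_count,
                              uint64_t host_bytes_per_sec,
                              uint64_t fpga_bytes_per_sec)
{
    if (host_bytes_per_sec <= fpga_bytes_per_sec) {
        return UINT64_MAX;
    }

    uint64_t pool_bytes = (uint64_t)buffer_size * buffer_count;
    uint64_t surplus    = host_bytes_per_sec - fpga_bytes_per_sec;

    /* Integer division truncates toward zero. */
    return (pool_bytes * 1000000u) / surplus;
}
```

With an assumed host rate of 300,000,000 bytes/s and an assumed FPGA rate of 100,000,000 bytes/s, four 16 KB buffers fill in about 327 microseconds and four 32 KB buffers in about 655 microseconds. Doubling the buffer size doubles the time before the pool is full, but it does not remove the limit if the FPGA stays slower on average.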

Please let me know the results after these modifications

Regards,

Rashi


Hello,

Yes, it is USB to GPIF to FPGA, and the FPGA speed is dynamic: sometimes fast, sometimes slow.

My code is based on the Cypress FX3 book <FX3 design by example>. The sample code is named GPIF_Example5 and was downloaded from the Cypress website.

Following your direction, I changed AUTO mode to AUTO SIGNAL mode and added the DMA callback to monitor CY_U3P_DMA_CB_PROD_EVENT and CY_U3P_DMA_CB_CONS_EVENT.

glDMARxCount and glDMATxCount differ while the process is running well (NOT blocked), and they become equal when the process blocks, which happens when the FPGA runs slower than the USB host.

So the reason may not be that the DMA buffers are full, but that they are empty. But I don't understand why the DMA buffers become empty when the FPGA runs at a lower speed during an OUT transfer.
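The reasoning above (equal counters mean an empty pool, not a full one) can be written down as a small classification of the two counters. This is only a sketch: the type and function names are invented for illustration, and it assumes each PROD and CONS event corresponds to exactly one buffer.

```c
#include <stdint.h>

typedef enum {
    POOL_EMPTY,        /* rx == tx: every filled buffer has been drained    */
    POOL_PARTIAL,      /* some buffers are still waiting for the consumer   */
    POOL_FULL,         /* all buffers are waiting; the producer has to wait */
    POOL_INCONSISTENT  /* difference is larger than the buffer count        */
} pool_state_t;

/* Hypothetical helper: interpret the PROD/CONS counters of one channel. */
pool_state_t classify_pool(uint32_t rx_count,
                           uint32_t tx_count,
                           uint32_t buffer_count)
{
    /* Unsigned subtraction stays correct across 32-bit counter wraparound. */
    uint32_t pending = rx_count - tx_count;

    if (pending == 0u) {
        return POOL_EMPTY;
    }
    if (pending < buffer_count) {
        return POOL_PARTIAL;
    }
    if (pending == buffer_count) {
        return POOL_FULL;
    }
    return POOL_INCONSISTENT;
}
```

For a 12-buffer channel, equal counters give POOL_EMPTY and a difference of 12 gives POOL_FULL. A difference larger than the buffer count cannot be a real occupancy; it may mean the two counters were read at different moments or that some events were not counted.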

I wonder whether the DMA has a timeout that I didn't handle in the firmware. BTW: the PC program has a timeout of 15 s, so the timeout does not come from the PC.

The following is my code and some of the configuration.

br

rocky

/*
* StartStopApplication.c
*
*      Author: john@usb-by-example.com
*/

#define DebugPrint    CyU3PDebugPrint

#define UART_CTS            (54)    // Also LED
#define UART_RTS            (53)    // Won't be used with UART so this gives me an extra GPIO to use
#define PUSH_BUTTON         (45)    // This example checks the PushButton and debounces it for the CPLD
#define CPLD_RESET          (27)    // Hold CPLD in RESET until I've set up and am ready to run
#define CPLD_PUSH_BUTTON    (26)    // Debounced PushButton
#define CPLD_LastRDData     (20)    // CPLD needs to tell me the last data on a READ
#define MAXCLOCKVALUE       (10)    // For CPLD board, with a -7 XC2C128 the maximum frequency with margin is 800/MAXCLOCKVALUE = 80MHz
#define DEBOUNCE_TIME       (20)    // 20 msec

// Define Event Flag values
// First 22 Events are defined in cyu3usb.h at "typedef enum CyU3PUsbEventType_t"
#define ANY_EVENT                 (0xFFFFFFFF)
#define USB_EVENTS                (0x7FFFFF)
#define USER_COMMAND_AVAILABLE    (1<<23)

// Define constants for blinking Error LED
#define PWM_PERIOD          (20000000)    // Approximately 10Hz
#define PWM_THRESHOLD       ( 5000000)    // On for 25% of the time

#define STANDARD_REQUEST    (0)            // My values are not shifted
#define CLASS_REQUEST       (1)
#define VENDOR_REQUEST      (2)

#define APPLICATION_THREAD_STACK       (0x1000)
#define APPLICATION_THREAD_PRIORITY    (8)

#define USB_CONSUMER_ENDPOINT            (0x81)    /* EP 1 IN */
//#define USB_CONSUMER_ENDPOINT_SOCKET    (CY_U3P_UIB_SOCKET_CONS_1)
#define USB_PRODUCER_ENDPOINT            (0x02)    /* EP 2 OUT */    //0X01
#define USB_PRODUCER_ENDPOINT_SOCKET     (CY_U3P_UIB_SOCKET_PROD_2)  //CY_U3P_UIB_SOCKET_PROD_1
#define CPU_PRODUCER_SOCKET              (CY_U3P_CPU_SOCKET_PROD)
//#define GPIF_PRODUCER_SOCKET            (CY_U3P_PIB_SOCKET_0)
#define GPIF_CONSUMER_SOCKET             (CY_U3P_PIB_SOCKET_1)

/* Burst length in 1 KB packets. Only applicable to USB 3.0. */
#define ENDPOINT_BURST_LENGTH    (16)      //ROCKY@20200504

/* DMA buffers used by the application. */
#define DMA_BUFFER_SIZE          (16384)   //ROCKY@20200504
#define DMA_BUFFER_COUNT         (12)       //(4) rocky@20200512

/* DMA callback function to handle the produce and consume events for U to P transfers. */
void
CyFxSlFifoUtoPDmaCallback (
        CyU3PDmaChannel   *chHandle,
        CyU3PDmaCbType_t  type,
        CyU3PDmaCBInput_t *input
        )
{
    if (type == CY_U3P_DMA_CB_PROD_EVENT)
    {
        glDMARxCount++;
    }
    if (type == CY_U3P_DMA_CB_CONS_EVENT)
    {
        glDMATxCount++;
    }
}

CyU3PReturnStatus_t StartGPIF(void)
{
    CyU3PReturnStatus_t Status;

    Status = CyU3PGpifLoad(&CyFxGpifConfig);
    DebugPrint(7, "\r\nUsing GPIF:%s", CyFxGpifConfigName);
    CheckStatus("GpifLoad", Status);

    Status = CyU3PGpifSocketConfigure(1, GPIF_CONSUMER_SOCKET, 6, CyFalse, 1);   //rocky@20200507
    CheckStatus("SetWatermark", Status);

    Status = CyU3PGpifSMStart(START, ALPHA_START);
    return Status;
}

const uint16_t EpSize[] = { 0, 64, 512, 1024 };

void StartApplication(void)
// USB has been enumerated, time to start the application running
{
    CyU3PEpConfig_t epConfig;
    CyU3PDmaChannelConfig_t dmaConfig;
    CyU3PReturnStatus_t Status;
    CyU3PPibClock_t pibClock;
    CyU3PUSBSpeed_t usbSpeed = CyU3PUsbGetSpeed();

    // Display the enumerated device bus speed
    DebugPrint(4, "\r\n@StartApplication, running at %sSpeed", BusSpeed[usbSpeed]);

    // Start GPIF clocks, they need to be running before we attach a DMA channel to GPIF
    pibClock.clkDiv = 5;                //5 : 80MHz pib CLOCK
    pibClock.clkSrc = CY_U3P_SYS_CLK;
    pibClock.isHalfDiv = CyFalse;//(ClockValue & 1);  //rocky@20200420
    pibClock.isDllEnable = CyFalse;        // Disable Dll since this application is synchronous
    Status = CyU3PPibInit(CyTrue, &pibClock);

    // Based on the Bus Speed configure the endpoint packet size
    CyU3PMemSet((uint8_t *)&epConfig, 0, sizeof(epConfig));
    epConfig.enable = CyTrue;
    epConfig.epType = CY_U3P_USB_EP_BULK;
    epConfig.burstLen = (usbSpeed == CY_U3P_SUPER_SPEED) ? (ENDPOINT_BURST_LENGTH) : 1;
    epConfig.pcktSize = EpSize[usbSpeed];    //1024 for usb 3.0

    // Setup and flush the producer endpoint
    Status = CyU3PSetEpConfig(USB_PRODUCER_ENDPOINT, &epConfig);
    CheckStatus("Setup USB_PRODUCER_ENDPOINT", Status);

    //ROCKY@20200512
    // Create an AUTO SIGNAL channel for the USB to GPIF transfer
    CyU3PMemSet((uint8_t *)&dmaConfig, 0, sizeof(dmaConfig));
    dmaConfig.size           = DMA_BUFFER_SIZE;            // Use same size buffers for all USB Speeds
    dmaConfig.count          = DMA_BUFFER_COUNT;
    dmaConfig.prodSckId      = USB_PRODUCER_ENDPOINT_SOCKET;
    dmaConfig.consSckId      = GPIF_CONSUMER_SOCKET;
    dmaConfig.dmaMode        = CY_U3P_DMA_MODE_BYTE;
    /* Enable the callback for produce and consume events. */
    dmaConfig.notification = CY_U3P_DMA_CB_PROD_EVENT | CY_U3P_DMA_CB_CONS_EVENT;  //CY_U3P_DMA_CB_PROD_EVENT;
    dmaConfig.cb = CyFxSlFifoUtoPDmaCallback;
    dmaConfig.prodHeader = 0;
    dmaConfig.prodFooter = 0;
    dmaConfig.consHeader = 0;
    dmaConfig.prodAvailCount = 0;
    Status = CyU3PDmaChannelCreate(&USB2GPIF_Handle, CY_U3P_DMA_TYPE_AUTO_SIGNAL , &dmaConfig);
    CheckStatus("USB2GPIF DmaChannelCreate", Status);

    Status = CyU3PUsbFlushEp(USB_PRODUCER_ENDPOINT);
    CheckStatus("Flush USB_PRODUCER_ENDPOINT", Status);

    // Start the DMA Channel with transfer size to Infinite
    Status = CyU3PDmaChannelSetXfer(&USB2GPIF_Handle, 0);
    CheckStatus("USB2GPIF DmaChannelStart", Status);

    // Reset the event counters
    glDMARxCount=0;
    glDMATxCount=0;

    // Load, configure and start the GPIF state machine
    Status = StartGPIF();
    CheckStatus("GpifStart", Status);

    // OK, Application can now run
    glIsApplicationActive = CyTrue;
}


Hello,

glDMARxCount/glDMATxCount become equal when the process blocks, which happens when the FPGA runs slower than the USB host.

So the reason may not be that the DMA buffers are full, but that they are empty.

>> Yes, you are right. A CONS event is generated only when a buffer has been fully consumed.

But I don't understand why the DMA buffers become empty when the FPGA runs at a lower speed during an OUT transfer.

>> Please confirm that the data is sent from USB continuously. Let me know which host application you are using to send data over USB. Is it possible to reproduce the same behavior with the Streamer application? If you are using a custom host application, please explain how it works.

Please capture the USB traffic with the Wireshark analyzer and share the traces for both cases: when the issue occurs and when the firmware works fine.

Regards,

Rashi


Hello,

I am using Streamer for testing now.

I found that:

1. When the FPGA runs at a low frequency, it has a low data consumption rate. In this situation, if I use lower settings in the host Streamer app, the process does not block; if I use higher settings, the process blocks. For example:

a. Packets per Xfer = 4, Xfers to Queue = 1: this does not block, no Failures.

b. Packets per Xfer = 8, Xfers to Queue = 1: this blocks and Failures occur.

The PC Streamer app timeout is 15000 ms, i.e. 15 s; the FPGA does not sleep for that long.

2. When the FPGA runs at a higher frequency, it has a high data consumption rate. In this situation, whatever settings I use in the Streamer app, the process does not block. For example:

Packets per Xfer = 1, 2, 4, 8, 16, 32, 64, 128, 256, 512; Xfers to Queue = 1, 2, 4, 8, 16, 32, 64

All are OK and do not block. The only difference is that lower settings give a lower speed and higher settings give a higher speed.

What happens when I run the FPGA at a lower frequency? What makes the Streamer app report timeout Failures at high app settings when the FPGA consumes data slowly?

I would like the Streamer to slow down automatically at high app settings when the FPGA slows down, instead of reporting Failures.
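One way to sanity-check the 15 s timeout against a merely slow FPGA is to estimate how long one Streamer transfer takes to drain at a steady consumption rate. The sketch below is a host-side calculation only. The function names are hypothetical, and the numbers in the usage note (16384 bytes per packet, i.e. 1024-byte packets with a burst of 16, and an FPGA rate of 1,000,000 bytes/s) are assumptions for illustration, not measurements from this thread.

```c
#include <stdbool.h>
#include <stdint.h>

/* Bytes in one transfer: packets per Xfer times bytes per packet. */
uint64_t xfer_bytes(uint32_t packets_per_xfer, uint32_t bytes_per_packet)
{
    return (uint64_t)packets_per_xfer * bytes_per_packet;
}

/*
 * Milliseconds needed to drain `bytes` at a steady rate, rounded up.
 * Returns UINT64_MAX when the rate is zero (the transfer never drains).
 */
uint64_t drain_time_ms(uint64_t bytes, uint64_t fpga_bytes_per_sec)
{
    if (fpga_bytes_per_sec == 0u) {
        return UINT64_MAX;
    }
    return (bytes * 1000u + fpga_bytes_per_sec - 1u) / fpga_bytes_per_sec;
}

/* True when a steady but slow drain alone would exceed the host timeout. */
bool slow_drain_explains_timeout(uint64_t bytes,
                                 uint64_t fpga_bytes_per_sec,
                                 uint32_t timeout_ms)
{
    return drain_time_ms(bytes, fpga_bytes_per_sec) > timeout_ms;
}
```

With the assumed numbers, xfer_bytes(8, 16384) is 131072 bytes, which drains in about 132 ms at 1,000,000 bytes/s, far below 15000 ms. If the real rates are anywhere near that, a slow but steady FPGA would not by itself explain the Failures, whereas a complete stall would.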

br

rocky


Hello Rocky,

Please share the debug prints that show the number of PROD and CONS events, for both cases:

a. Packets per Xfer = 4, Xfers to Queue = 1: this does not block, no Failures.

b. Packets per Xfer = 8, Xfers to Queue = 1: this blocks and Failures occur.

Regards,

Rashi


Hello Rashi,

This debugging has been frustrating. I still can NOT solve the timeout problem that occurs when the GPIF blocks.

My current code uses a one-to-one structure. Would it help to use multiple sockets or threads?

br

rocky

The debug prints are below:

Data tracker: RX count: 1820, TX count: 1820.
Data tracker: RX count: 1824, TX count: 1824.
Data tracker: RX count: 1824, TX count: 1824.
Data tracker: RX count: 1824, TX count: 1824.
Data tracker: RX count: 2148, TX count: 2148.
Data tracker: RX count: 2412, TX count: 2228.
Data tracker: RX count: 2422, TX count: 2422.
Data tracker: RX count: 2696, TX count: 2696.
Data tracker: RX count: 2696, TX count: 2696.
Data tracker: RX count: 2974, TX count: 2974.
Data tracker: RX count: 3258, TX count: 3076.
Data tracker: RX count: 3276, TX count: 3276.
Data tracker: RX count: 3560, TX count: 3554.
Data tracker: RX count: 3560, TX count: 3560.
Data tracker: RX count: 3834, TX count: 3834.
Data tracker: RX count: 4110, TX count: 3926.
Data tracker: RX count: 4110, TX count: 4110.
Data tracker: RX count: 4111, TX count: 4111.


Hello,

Please try calling the LPMDisable API before the application starts and let me know the results.

[Image: lpmdisable.png]

My current code uses a one-to-one structure. Would it help to use multiple sockets or threads?

>> From the debug prints it seems that no producer events are seen. Please let me know the type of the variables you declared for tracking PROD and CONS events. Please confirm they are declared as

uint32_t glDMARxCount = 0;             

uint32_t glDMATxCount = 0;

Regards,

Rashi


Hello Rocky,

From the debug prints, I observed that there is a difference between the data received and the data consumed, which can be a reason for the issue (Streamer app failures), as it points to a DMA buffer overflow.

Data tracker: RX count: 3834, TX count: 3834.

Data tracker: RX count: 4110, TX count: 3926. // possible reason

Data tracker: RX count: 4110, TX count: 4110.

- Please try increasing the DMA buffer size to 32 KB and the DMA buffer count to 6. This should solve the problem.

- To confirm the problem, you can also change the DMA channel to a manual channel and check whether the buffer commit fails with error code CY_U3P_ERROR_INVALID_SEQUENCE:

Status = CyU3PDmaChannelCreate(&USB2GPIF_Handle, CY_U3P_DMA_TYPE_MANUAL , &dmaConfig);

Then commit the DMA buffers manually. If the DMA buffers are already full (not yet consumed by the FPGA) and data is still being sent from USB, the CyU3PDmaChannelCommitBuffer API will fail. You will see the debug print below if this happens.

void
CyFxSlFifoPtoUDmaCallback (
        CyU3PDmaChannel   *chHandle,
        CyU3PDmaCbType_t  type,
        CyU3PDmaCBInput_t *input
        )
{
    CyU3PReturnStatus_t status = CY_U3P_SUCCESS;

    if (type == CY_U3P_DMA_CB_PROD_EVENT)
    {
        /* A buffer was filled by the producer: count it and pass it on to the consumer. */
        glDMARxCount++;
        status = CyU3PDmaChannelCommitBuffer (chHandle, input->buffer_p.count, 0);
        if (status != CY_U3P_SUCCESS)
        {
            CyU3PDebugPrint (4, "CyU3PDmaChannelCommitBuffer failed, Error code = %d\n", status);
        }
    }
    if (type == CY_U3P_DMA_CB_CONS_EVENT )
    {
        /* A buffer was drained by the consumer. */
        glDMATxCount++;
    }
}

Please let me know the results

Regards,

Rashi
