PSOC6 DMA block selection for SPI

tomlee

I am using the CY8C6245LQI-S3D72 to test SPI with DMA. However, the DMA selection for the SPI TX and RX channels is limited to DW0_CH18 and DW0_CH19. Because both channels share DW0, they can block each other.

tomlee_0-1628762796489.png

 

How can I set SPI TX to use DW0 and SPI RX to use DW1 in the Device Configurator, so that they can work in parallel without a race condition? I have included my example project for your reference.

 

Hari

Hi @tomlee 

 

It is not possible to route DW0 and DW1 to Tx and Rx respectively in this device. 

However, can you explain the issue you are facing? DMA runs significantly faster than SPI transactions, so sharing one DW block should not cause any issue. You can also make the Tx DMA preemptable and give the Rx DMA a higher priority so that no received data is lost.

 

Best regards, 
Hari

tomlee

The problem I am facing is that the SPI transfer breaks when running at 25 MHz.

tomlee_0-1629100782628.png

 

I have already followed your suggestion to set the Tx DMA as preemptable and give the Rx DMA a higher priority. It does not help.

tomlee_1-1629100865927.png

tomlee_2-1629100878494.png

I have included my testing project.

 

 

tomlee

Any solution from Cypress?

Hari

Hi @tomlee 

 

Unfortunately, in this device, the SPI block cannot be routed to two different DMA DW blocks. Each SCB block has a hardware connection to only one of DW0 or DW1.

 

Best regards, 
Hari

tomlee

Does this mean there is no solution for the SPI breaking issue because of a hardware limitation?

25 MHz SPI is critical in my use case. If so, I will turn down the sales representative who keeps persuading me to use this device and use another MCU, such as one from ST...

RodolfoGL

A few suggestions:

1) Can you set up the SPI to do 16-bit transfers instead of 8-bit transfers? Then the DMA can transfer WORD to HALFWORD or HALFWORD to WORD.

2) Are you running CLK_SLOW at its maximum speed of 100 MHz? The DMA is clocked from CLK_SLOW.
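
To put rough numbers on these two suggestions, the sketch below computes how many DMA clock cycles pass while the SPI shifts out one frame. The function and parameter names are illustrative only (they are not PDL identifiers), and the thread does not say how many cycles one DMA element transfer costs, so this gives a budget, not a verdict.

```c
#include <assert.h>
#include <stdint.h>

/* DMA clock cycles that elapse while the SPI shifts one frame.
 * Names are hypothetical; nothing here touches hardware. */
static uint32_t dma_cycles_per_spi_frame(uint32_t dma_clk_hz,
                                         uint32_t spi_clk_hz,
                                         uint32_t bits_per_frame)
{
    assert(spi_clk_hz > 0u);
    /* frame time = bits / spi_clk, cycles = frame time * dma_clk */
    return (uint32_t)(((uint64_t)dma_clk_hz * bits_per_frame) / spi_clk_hz);
}
```

With a 100 MHz CLK_SLOW and a 25 MHz SPI clock, an 8-bit frame leaves 32 cycles and a 16-bit frame leaves 64. Within that window the TX and RX channels each have to complete one element transfer, which is why doubling the frame width was worth trying.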

tomlee

1. I have tried WORD to HALFWORD and HALFWORD to WORD, but the SPI transfer still breaks.

2. CLK_SLOW is already set to 100 MHz.

RodolfoGL

When you made change (1), did it improve things at all?

One more thing you can try: instead of setting the trigger input type to "One transfer per trigger", set it to "An entire descriptor transfer per trigger". This avoids the extra cycles the DMA needs to retrigger, since it currently can't keep up with the bandwidth.

tomlee

I do not see any improvement from change (1). It also affects my usage, as I need to transfer SPI data of variable length, including odd-length arrays.

I changed the TX and RX descriptors to use "An entire descriptor transfer per trigger", and the break is even worse. It sent a few bytes and then stalled for a long time before the remaining transfer.

tomlee_1-1629859238072.png

 

Can you try my test project (SCB_SPI_Master_DMA.zip) and figure out a workable solution? Trying one suggestion per day is very inefficient. I have been frustrated by this issue for weeks...

 

RodolfoGL

OK. I will take a look at your project.

But let me ask: is the issue the speed (you want to send this data as fast as you can without any break), or is it that the SS line de-asserts?

If it is the latter, one way to solve this is to assert and de-assert the SS line manually in firmware. That keeps SS low during the entire transfer, although the SPI clock might still stop a few times in between, as you see in your graphs.

tomlee

Keeping the SS pin low throughout is not acceptable in my use case, as the average speed would still be degraded. I expect 25 MHz SPI with no breaks, as specified in the datasheet.

RodolfoGL

Note that the bottleneck here is the DMA, not the SPI. If you only need to send a frame smaller than the SPI FIFO size, you will not see any speed degradation.

I don't have much time to try new things, but I did look at your project. I'm not sure why you are using two descriptors per DMA channel, but I have one more idea.

Rather than using the SPI TX DMA hardware trigger connection, you could use one of the DW1 channels to transfer directly from tx_buffer to the SPI TX FIFO. You can trigger this DMA in software. Your SPI RX DMA can stay as it is.

This approach would satisfy your initial request to use DW1.

tomlee

I need to send a data array that is larger than the SPI FIFO size, and I use two descriptors for the flexibility of sending SPI data of variable length.

Following your suggestion, I used DW1_CH0 with the source address set to tx_buffer and the destination address set to TX_FIFO_WR. But after triggering the DMA, I do not see any data on the SPI lines. I am not sure whether my settings are correct, or whether this method is workable at all.

I have included my modified project for your reference.

RodolfoGL

You need to trigger the entire descriptor chain for the new txDma. Set the Trigger input type to "Entire descriptor chain per trigger".

You also haven't added the call that triggers the txDma. Remember, there is no hardware trigger anymore, so just enabling the DMA will not start it. You have to add this line of code:

Cy_TrigMux_SwTrigger(TRIG1_OUT_CPUSS_DW1_TR_IN0, CY_TRIGGER_TWO_CYCLES);

More info about how to use this SW trigger is here:

https://infineon.github.io/psoc6pdl/pdl_api_reference_manual/html/group__group__trigmux__functions.h...

Note that with this method there is no hardware flow control, so you might see overflow in the TX FIFO. I don't think you will see overflow if you are transferring only 1500 bytes: the SPI runs at a very high frequency, so the DMA might still be the bottleneck. Even if the DMA is no longer the bottleneck, it will take some time before an overflow shows up.

tomlee

I changed to "Entire descriptor chain per trigger". But adding this line gives a compilation error saying that TRIG1_OUT_CPUSS_DW1_TR_IN0 is not available.

Cy_TrigMux_SwTrigger(TRIG1_OUT_CPUSS_DW1_TR_IN0, CY_TRIGGER_TWO_CYCLES);

 

Then I changed to the following, but there is still no data coming out on the SPI lines.

Cy_TrigMux_SwTrigger(TRIG_OUT_MUX_0_PDMA0_TR_IN0, CY_TRIGGER_TWO_CYCLES);

tomlee_0-1629944473658.png

Can you try my project on the CY8CPROTO-062S3-4343W development board to see whether it really works? I am not familiar with the DMA settings, and the datasheet does not describe them clearly...

 

 

RodolfoGL

It seems for this device, it should be:

TRIG_OUT_MUX_1_PDMA1_TR_IN0

tomlee

After changing to TRIG_OUT_MUX_1_PDMA1_TR_IN0, I can see data coming out on the SPI lines. But now I see 2 problems:

1. The SPI RX DMA interrupt is not triggered after the transfer.

2. Data is missing in the TX transfer, which I think is due to FIFO overflow. The test data should increase by 1 every time.

tomlee_0-1629951331776.png

 

RodolfoGL

This is good. The DMA is no longer the bottleneck. The RX DMA interrupt no longer triggers because, with the TX FIFO overflowing, the transmission of all bytes never completes.

So the only thing we need now is software flow control. You could set your TX DMA's trigger input type to "One X loop transfer per trigger" and set the interrupt type to "Trigger on every X loop transfer". In the interrupt handler, you can trigger the DMA again (the same way you did in your main function). You might need to add some delay in that interrupt handler to slow things down further. You can check the TX FIFO level to see how full it is and delay proportionally.
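
As a rough illustration of "delay proportionally", the helper below estimates how long to wait before re-triggering so that the next X loop fits in the TX FIFO. It is a minimal sketch with hypothetical names and pure arithmetic; on hardware the fill level would presumably be read from the SCB (for example with the PDL's Cy_SCB_SPI_GetNumInTxFifo()), and the FIFO depth is passed in as an assumption rather than taken from the datasheet.

```c
#include <assert.h>
#include <stdint.h>

/* Nanoseconds to wait before pushing burst_len more frames into a TX FIFO
 * that currently holds fifo_level frames. ns_per_frame is the time the SPI
 * needs to shift one frame out (320 ns for 8 bits at 25 MHz).
 * All names are illustrative, not PDL identifiers. */
static uint32_t refill_delay_ns(uint32_t fifo_depth, uint32_t fifo_level,
                                uint32_t burst_len, uint32_t ns_per_frame)
{
    assert(fifo_level <= fifo_depth);
    assert(burst_len <= fifo_depth);

    uint32_t free_space = fifo_depth - fifo_level;
    if (burst_len <= free_space) {
        return 0u;  /* enough room already, trigger immediately */
    }
    /* Wait for exactly the missing number of frames to drain. */
    return (burst_len - free_space) * ns_per_frame;
}
```

This bound is conservative because it ignores frames that drain while the DMA is still writing, so a real handler could likely wait less.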

In the meantime, I have asked a colleague to try to implement this for you, in case you still have problems implementing it. But I feel we are close to a satisfactory solution for you.

tomlee

It is getting complicated for me now.

My understanding of the situation so far:

1. Hardware-triggered DMA for SPI cannot keep up at 25 MHz because of a hardware design limitation, so a software method is needed.

2. The software method requires a driver that emulates the flow control mechanism, which looks very tricky.

I think I need to ask for your help to implement that, as it is beyond what I can work out. I believe it is Cypress's responsibility to provide a patched driver that solves this problem.

What I want is simple: a 25 MHz SPI transfer function that can do TX and RX at the same time, like the following:

spi_transfer(uint8_t *input_data, uint8_t * output_data, uint32_t length)

 

Thanks for the help.

tomlee

After discussion with our team, we decided that if this issue cannot be solved by next Friday (3 Sept), we will give up on this device and use an ST MCU. Because of the tight schedule for the new product, we cannot afford more time struggling with a driver issue that comes from a hardware limitation.

Hari

Hi @tomlee 

 

I've updated the project with the required changes and now I do not see any Tx overflow and all the data is present. Can you try this project at your end? 

Do let us know if there are any other issues that you see. 

 

Best regards, 
Hari

tomlee

I tried your project and found that transferring 1501 bytes works normally. But when I try other lengths, such as 3963 bytes, it fails again. I have included my test project for your reference. There are 6 transfer length combinations I want to test.

Hari

I believe the reason for this is Descriptor 1. I did not configure descriptor 1 the same way as descriptor 0, and that is still causing the overflow issue. I'll update this and provide a generic solution soon...

Hari

Hi @tomlee 

 

The issue here was a combination of Tx overflow and incorrect parameters in descriptor 1's configuration.

I am attaching the updated project.

Changes I made:

1. Updated the Tx DMA's descriptor 1 so that it, too, is triggered once per X loop.

2. Set descriptor 1 to increment the source address.

3. Modified the code so that descriptor 0 also sends only 150 bytes per X loop. Increasing this number to 256 causes Tx buffer overflow, whereas 150 bytes worked reliably in my tests. Note that the Y loop can have up to 256 X loops without any issues. Only the X loop transfer size can cause a problem.

We could fix this in firmware as well, but I feel the firmware is complex enough as it is, and adding an interrupt on every transfer is not practical. So please limit the X loop transfers to 150 bytes, and you can have reliable software DMA triggers.
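
The 150-byte limit implies splitting each transfer into full X loops plus a remainder. The sketch below shows one plausible way to do that split. The attached project is not visible here, so the mapping onto descriptor 0 and descriptor 1, and all names, are assumptions for illustration.

```c
#include <assert.h>
#include <stdint.h>

#define MAX_X_COUNT 150u  /* bytes per X loop reported as reliable above */
#define MAX_Y_COUNT 256u  /* X loops allowed per descriptor */

typedef struct {
    uint32_t full_x_loops;  /* number of 150-byte X loops (assumed: descriptor 0) */
    uint32_t tail_bytes;    /* leftover bytes (assumed: descriptor 1), 0 if none */
} tx_plan_t;

/* Split a transfer length into full X loops and a tail. */
static tx_plan_t plan_tx(uint32_t length)
{
    tx_plan_t plan;
    plan.full_x_loops = length / MAX_X_COUNT;
    plan.tail_bytes = length % MAX_X_COUNT;
    /* A single 2D descriptor cannot hold more than MAX_Y_COUNT X loops. */
    assert(plan.full_x_loops <= MAX_Y_COUNT);
    return plan;
}
```

For the 3963-byte test case this yields 26 full X loops and a 63-byte tail. For 1501 bytes it yields 10 full X loops and a 1-byte tail.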

 

Please try this out and let me know if you face any other issues.

 

Best regards, 
Hari

tomlee

I cannot see the changes you mentioned in the project. Did you upload the correct project?

Hari

Reattaching the correct project...

tomlee

I tried your modified project but still see problems. When I test a length of 3963 bytes, the Tx data is correct, but the Rx buffer contains only 3840 bytes; the last 123 bytes are missing. It seems that the last descriptor of the RX transfer does not take effect. I changed the RX descriptors to follow the TX descriptors and use the 150-byte combination, but it still does not work. I am testing with the MOSI and MISO pins connected together.

I have 2 other concerns with this method:

1. I expected a descriptor to be able to transfer 65536 bytes (256*256). Now it is limited to 38400 bytes (150*256), which will be marginal for later development.

2. During the SPI transfer, the TX DMA interrupt is triggered many times and blocks in the interrupt handler (28 us). This is a concern for us, as we expect the MCU to be free for other tasks during a DMA transfer.
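
To gauge the second concern, the sketch below counts the software triggers a transfer needs and computes the ideal time on the wire. The names are hypothetical, and reading the 28 us figure as a per-interrupt cost is an assumption, not something the thread confirms.

```c
#include <assert.h>
#include <stdint.h>

/* Software triggers (and hence TX DMA interrupts) needed for one transfer. */
static uint32_t trigger_count(uint32_t length, uint32_t burst_len)
{
    assert(burst_len > 0u);
    return (length + burst_len - 1u) / burst_len;  /* ceiling division */
}

/* Ideal time on the wire in nanoseconds, ignoring any gaps between frames. */
static uint64_t wire_time_ns(uint32_t length, uint32_t bits_per_frame,
                             uint32_t spi_clk_hz)
{
    assert(spi_clk_hz > 0u);
    return ((uint64_t)length * bits_per_frame * 1000000000ull) / spi_clk_hz;
}
```

For 3963 bytes in 150-byte bursts this gives 27 triggers. If each one really blocks for 28 us, that is about 756 us of handler time against roughly 1268 us on the wire at 25 MHz, so the CPU would be busy for a large share of the transfer.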

 

Hari

Hi @tomlee 


I had only tested the Tx portion. I will check the Rx and get back to you. 

 

1. Yes, that is a limitation. Fixing it would mean spending a lot more time in interrupts, which might not be suitable.

2. This, again, is unavoidable with the firmware flow control in the current design.

If such large transfers are crucial for the application, can you try another device that supports a hardware connection from the Tx FIFO to DW1? You can try a kit such as the CY8CKIT-062-BLE, whose device has a hardware connection to DW1.

Hari_0-1630309009299.png

 

You can port the application directly to this kit without changing your original project code. Only the Device Configurator settings would need some porting.

 

Best regards, 
Hari

Hari

Hi @tomlee 

 

We found another way to handle this. This method requires two extra GPIOs for triggering DW1.

We route the SCB (SPI) Tx trigger output to a GPIO. We then configure another GPIO as an input and use it to trigger the DW1 DMA. The two pins are shorted externally.

This removes the need for firmware flow control and, ideally, lets the system run without any CPU intervention. We can also use the full 256x256 bytes per descriptor in this scenario.

I am attaching a project that showcases this. You can try it on your kit by shorting pin P11.0 to P11.4.

 

Please try this out and let me know if it works for you and whether using these 2 extra GPIOs is a viable option.

 

Also, please note that the issue that you were seeing with Rx was because the destination increment was not set correctly. It should be working now. 

 

Do let us know your thoughts. 

 

Best regards, 
Hari

tomlee

This method works for me. But I find that even if I do not connect P11.0 and P11.4 on the development board, the SPI still transfers normally without error. Is that expected?

tomlee

Any reply from Cypress? I need an answer to move forward...

RodolfoGL

If you don't connect the pins externally, the input pin floats and will read as active high, which means the DMA will be triggered as soon as you enable it.

So if you don't connect the pins externally, there is a chance you will see overflow in the TX FIFO.

tomlee

So you mean the trigger pin is low by default and goes high when an SPI transfer is needed? I checked the level of P11.0 and P11.4 (shorted together) and found that it is always high, even when there is no SPI transfer. Can you check whether that is correct?

RodolfoGL

It is high as long as there is space in the TX FIFO (or depending on how the FIFO is set).
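
A toy model may make the role of this level-sensitive trigger clearer. In the sketch below the DMA tries to write one byte per tick, and the SPI removes one byte every drain_period ticks. With gating, the DMA writes only while the FIFO has space. Without it (a floating input stuck high), writes into a full FIFO are lost. The rates, depth, and names are invented for illustration and do not model real PSoC 6 timing.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Push `total` bytes through a TX FIFO of `depth` entries and return how
 * many bytes were lost to overflow. */
static uint32_t simulate_tx(uint32_t total, uint32_t depth,
                            uint32_t drain_period, bool gated)
{
    assert(depth > 0u && drain_period > 0u);
    uint32_t level = 0u, written = 0u, lost = 0u, tick = 0u;

    while (written < total || level > 0u) {
        tick++;
        if (written < total) {
            if (level < depth) {
                level++;      /* byte accepted by the FIFO */
                written++;
            } else if (!gated) {
                lost++;       /* write into a full FIFO is dropped */
                written++;    /* the ungated DMA has already moved on */
            }
            /* gated and full: trigger is low, so the DMA simply waits */
        }
        if (tick % drain_period == 0u && level > 0u) {
            level--;          /* SPI shifts one byte out */
        }
    }
    return lost;
}
```

In this model the gated case never loses data regardless of the rates, while the ungated case loses bytes as soon as the DMA is faster than the SPI. That matches the behaviour described above when the two pins are left unconnected.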

tomlee

I tested this solution and found it robust enough for long-array SPI transfers. I recommend that Cypress not hard-code the DMA routing in future devices, as it was really complicated to arrive at a solution.
