How to set up shared memory between CM4 and CM0p


TeMa_1467596

I have a project that uses a CY8C6347BZI-BLD53 on the PSoC 6 CY8CKIT-062-BLE, and I've got basic functionality running on both cores.  This is a port of a project that used a TI Concerto dual-core processor, with one core continuously taking ADC readings at 21,600 samples per second and storing them in dual 12,600-element circular buffer arrays.  The core taking the ADC readings performs calculations on every set of readings (every 46.3 µs) and decides whether an event has occurred.  If an event has occurred, the other core is told to capture a data set that is 12,600 readings long and starts from the point where the event occurred.

In the old TI processor, I created sections in .cmd files and then defined arrays in the code for both cores that pointed to the same address (as an aside, the Concerto made you pick which core 'owned' the memory address and only the one that owned the address was able to write, the other was read only).

I've considered 3 approaches:

1.     Set up DMA so that the ADC results are stored in specific address blocks but I think this approach has 2 issues (or gaps in my understanding)

     a.     I need to perform calculations on the results as they are sampled, DMA does not seem to favor this requirement

     b.     I still need to get the data from the stored address using the other core

2.     Use the IPC example in CE216795 but this seems to involve a lot of handshaking between the 2 cores and passes 1 memory address location over from one core to the other as a pointer which seems inefficient.

3.     Try to recreate my original solution, which is to create arrays for both cores that 'live' at the same memory addresses. I think this can be done using SECTION commands but I'm unsure of how to implement that.

Any ideas?  If 3 is viable, how do I create the arrays such that they live at the same address?


TeMa_1467596

Hmmm, I'm working in PSoC Creator 4.2 and a Cypress FAE assisted me by providing a starter file that I'm working from; I don't know whether it's non-standard but it includes .ld files for both cores specifically cy8c6xx7_cm4_dual.ld and cy8c6xx7_cm0plus.ld

So can't I just add/create my own data section to these files and then specify that my array is stored in that section?  If yes, some help with the syntax and a heads up on any obvious pitfalls would be appreciated.


Ted,

The best option with PSoC Creator is to use Option 2 in your list.

I do not think it is inefficient; it just takes extra steps at the beginning to pass the memory address from one core to the other. One core holds the actual memory and the other core holds a pointer. On boot, you send the memory address from the owning core to the other core and initialize the pointer there. Once the pointer is initialized, you can simply use the pointer to access the memory. But do keep in mind that you need to provide proper protection for this shared memory for write access (as explained in the IPC example), as both cores will have the capability to update this memory. If you are only going to read the memory from one core, then define the actual memory in the other core and use a pointer to a constant (something like const uint32 * sharedMemory) in the core that only does reading.
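As a rough illustration of the pointer hand-off described above, here is a minimal sketch. It is not taken from CE216795: ipcMailbox is a hypothetical stand-in for whatever IPC channel carries the address at boot, and the function names and SAMPLE_COUNT are made up for the example. On the real part the two halves would live in the separate CM0+ and CM4 source files.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SAMPLE_COUNT 12600u   /* assumed buffer length, from the question */

/* Owning core: the real storage is defined here. */
static volatile uint16_t voltage[SAMPLE_COUNT];

/* Hypothetical stand-in for the IPC channel: a single word that carries
 * the buffer address from the owning core to the reading core at boot. */
static volatile uintptr_t ipcMailbox;

/* Owning core, at boot: publish the address of the buffer. */
void owner_publish_buffer(void)
{
    ipcMailbox = (uintptr_t)&voltage[0];
}

/* Owning core: store one sample. */
void owner_store_sample(size_t index, uint16_t sample)
{
    assert(index < SAMPLE_COUNT);
    voltage[index] = sample;
}

/* Reading core: pointer to const, so the compiler rejects accidental writes. */
static const volatile uint16_t *sharedMemory;

/* Reading core, at boot: initialize the pointer from the received address. */
void reader_attach(void)
{
    sharedMemory = (const volatile uint16_t *)ipcMailbox;
}

/* Reading core: fetch one sample through the pointer. */
uint16_t reader_get_sample(size_t index)
{
    assert(sharedMemory != NULL && index < SAMPLE_COUNT);
    return sharedMemory[index];
}
```

After the one-time hand-off, every access is an ordinary memory read or write; no per-sample handshaking is involved unless you add it for protection.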

Now coming to option 3, the reason why this is not possible in PSoC 6 is that it compiles and generates the code for M0+ and M4 separately and then merges them into one elf/symbols file. If you try to place a variable in the same location on both cores, it will throw an error that the areas overlap or the location is already used while combining the images. So you will not be able to build the code successfully.

The '.ld' file is the linker script for GCC and both CM4 and CM0+ will have their own linker scripts for the above mentioned reason (they are built separately). You can edit these files to change how the memory is allocated to cores and where your functions/data are placed. You can learn more about these files here - https://sourceware.org/binutils/docs/ld/Scripts.html

Let us know if you need any help with the above.

Regards,

Meenakshi Sundaram R

Meenakshi Sundaram R,

Thanks for the information.  I will study the CE216795 example and get back with you with any questions I have.

Ted


Meenakshi Sundaram R,

I studied the CE216795 example but I'm struggling to see how I can use this.  All it does is define one 8 bit variable that can be written to and read from by either core.  If I'm reading 21,600 16-bit numbers per second on one core and sending them all through a single byte variable that handshakes every byte, that seems crazy. And how does the receiving core know that there's a new number to get when the latest value could be the same as the previous one?  The CE216795 example seems to rely on the fact that the shared variable changes value in order for a core to react to something.

Given the power of a dual core processor such as the PSoC6, surely there has to be a way to share variables between cores as an array mapped to the same address in RAM?  Maybe this can be done by modifying CE216795 to have an array instead of a uint8_t variable, e.g., in main_cm0p.c we can write...

static volatile uint16_t voltage[21600];

instead of...

static uint8_t sharedVar;

And then pass the address of the beginning of the array to the other core.

But, I'm also going to need at least 2 'channels' of communication - in my old system I did this with an array of 20 control variables which were read by the receiving core so for example ctrlVariable[0] could be the polled shared number that was set to something greater than zero to indicate the index value of interest to the receiving core.

Any thoughts?

MarkA_91

Hello Ted, I developed the CE216795 and will be happy to help. A few points:

The single byte shared variable in CE216795 is meant to be an example only. The variable can be expanded to be anything - an array, a structure, even a resource such as a UART. The point of the code example is to show how to use the IPC block in PSoC 6 to create a mutex to prevent simultaneous accesses of a shared resource by the two CPUs.

That said, if you don't care about simultaneous access, you can simply define a region in SRAM and let both CPUs do unrestricted reads and writes to that region.

In addition to enabling a mutex, the IPC can be used to signal between the CPUs, for example in an "I have data" - "I got the data" scheme. Please see the IPC documentation for more information.

Regards,

Mark Ainsworth


Mark,

Thanks for chiming in.  I can see that the mutex can be used for signalling between the 2 cores but I am confused about how to set up variables so that both cores can access them.  I have been working with Corey Wilner of Cypress, who posted the thread "ADC and DMA in PSOC6" on my behalf, in which Yesh suggests using DMA.  Can you shed any light on how to define an array in RAM that is common to both cores?


Hi Ted,

I know of two ways to define an array in RAM that is common to both cores:

  1. Define the array in the code for one of the cores. Then pass the address of the array to the other core via IPC. This is what’s done in CE216795.
  2. Modify the linker scripts for both cores to reserve the same portion of memory in a uniquely-named memory region and section. Then in the code you can declare any number of variables, arrays, structures, etc. in that section, and access them as is normally done. I attached a simple project to show how this is done – see the .ld files. Note that in this example the two cores avoid colliding by simply accessing different portions of the array foo[]. You can see this in debug – open a memory view of foo[] and as you step through the code for one core you can see the bytes changing in the other portion of the array.

There are numerous other ways to avoid colliding, depending on your application.

Hope this helps,

Mark
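For readers without access to the attachment, the C side of option 2 might look roughly like the sketch below. It is not the attached project: the section name ".shrdsection", the array size, the half-and-half split, and the function names are all assumptions for illustration. Both linker scripts are assumed to map the section to the same address, and the section attribute is only applied when building for Arm, so the sketch also compiles on a host.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Place the variable in the section that both linker scripts map to the
 * same RAM address. ".shrdsection" is an assumed name. */
#if defined(__GNUC__) && defined(__arm__)
#define SHARED_RAM __attribute__((section(".shrdsection")))
#else
#define SHARED_RAM   /* host build: ordinary RAM */
#endif

#define FOO_SIZE   16u
#define CM4_FIRST  (FOO_SIZE / 2u)   /* CM0+ owns [0, 8), CM4 owns [8, 16) */

/* The same declaration appears in the source for both cores. If the section
 * is marked NOLOAD, do not rely on the startup code to initialize it. */
volatile uint8_t foo[FOO_SIZE] SHARED_RAM;

/* CM0+ writes only to the lower half of foo[]. */
void cm0p_write(size_t offset, uint8_t value)
{
    assert(offset < CM4_FIRST);
    foo[offset] = value;
}

/* CM4 writes only to the upper half of foo[]. */
void cm4_write(size_t offset, uint8_t value)
{
    assert(offset < FOO_SIZE - CM4_FIRST);
    foo[CM4_FIRST + offset] = value;
}
```

Because each core writes to its own half, neither core can overwrite the other's data, and either core may read the whole array.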

Mark,

I will look at this tomorrow.  It sounds like your option 2 is exactly what I was looking for.  I'll probably have some questions though I could start by asking what the rules are for access of shared addresses?  If one core is read only (enforced by code) is there a problem if one core tries to read from, at the same time the other is writing to, the same location?


The only hardware restriction on accessing memory (and peripheral registers) by the bus masters is arbitration by the Arm AHB bus. Please see the attached block diagram image from the datasheet. Note that there are actually five bus masters: Cortex-M4 CPU, Cortex-M0+ CPU, two DMA controllers, and a CRYPTO engine.

[Image: block diagram from the datasheet]


Mark,

Thanks so much, you correctly explained how to define shared memory, which I now have working. I will mark your answer above as the correct solution, but I still have a few questions that will help me and others following in my footsteps:

  1. The project you provided (and the one given to me by a Cypress FAE) have the .ld files available in the project folders - if I were to create a new project from scratch, how do I make them appear there?
  2. Looking inside the .ld files, it appears that they are designed to provide space for many different things (e.g. SMIF); is there an application note on them?  It would be useful to have some rules to follow in this regard. For instance, your example 'steals' RAM that was available to the M0+ core but leaves the RAM assigned to the M4 core untouched (it only adds the shared RAM that was 'stolen' from the M0+).  If I wanted to split the contribution between the M0+ and the M4 cores, is that OK?  I started with approx 150k of RAM assigned to the M0+ but over 50k is now assigned to shared RAM so the amount available to the M0+ is <100k; everything seems OK right now but it would be good to know for future reference.
  3. If the .ld files are 'off the shelf' for the PSoC6 processor (as I suspect), do I need to back these up in case an update overwrites my changes?
  4. I ran into what I think is a GCC limitation; I believe that the linker doesn't guarantee the order in which variables are placed in memory - my original allocation was...

cm0plus

    ram      (rwx) : ORIGIN = 0x08000000, LENGTH = 0x17AC0  (96,960 bytes)
    shrdRAM  (rwx) : ORIGIN = 0x08017AC0, LENGTH = 0xC540   (50,496 bytes stolen from original allocation)
    flash    (rx)  : ORIGIN = 0x10000000, LENGTH = 0x80000  (524,288 bytes = same as before)

m4

    shrdRAM  (rwx) : ORIGIN = 0x08017AC0, LENGTH = 0xC540   (area shared with cm0plus)
    ram      (rwx) : ORIGIN = 0x08024000, LENGTH = 0x23800  (same as before)
    flash    (rx)  : ORIGIN = 0x10080000, LENGTH = 0x80000  (524,288 bytes = same as before)

     but that didn't work as hoped - I have 3 arrays as follows...

uint16 ssVoltsArray[0x3140] CY_SECTION(".shrdsection"); // defines an array of 12,608 x 16 bit integers at pre-defined RAM location

uint16 ssAmpsArray[0x3140] CY_SECTION(".shrdsection");  // defines an array of 12,608 x 16 bit integers at pre-defined RAM location

uint16 xchngArray[30] CY_SECTION(".shrdsection");       // defines an array of 30 x 16 bit integers at pre-defined RAM location

but in one core the smaller xchngArray was allocated first while on the other it was last, even though they were listed in the same section, and it didn't change when I put xchngArray first or last in my list.  So I did what I've done before (same issue in TI's Code Composer Studio using GCC), which was to create 2 separate areas of shared memory in the .ld file as follows...

cm0plus

    ram       (rwx) : ORIGIN = 0x08000000, LENGTH = 0x17AC0  (96,960 bytes)
    shrdRAM1  (rwx) : ORIGIN = 0x08017AC0, LENGTH = 0xC500   (50,432 bytes stolen from original allocation)
    shrdRAM2  (rwx) : ORIGIN = 0x08023FC0, LENGTH = 0x40     (64 bytes stolen from original allocation)
    flash     (rx)  : ORIGIN = 0x10000000, LENGTH = 0x80000  (524,288 bytes = same as before)

m4

    shrdRAM1  (rwx) : ORIGIN = 0x08017AC0, LENGTH = 0xC500   (area shared with cm0plus)
    shrdRAM2  (rwx) : ORIGIN = 0x08023FC0, LENGTH = 0x40     (area shared with cm0plus)
    ram       (rwx) : ORIGIN = 0x08024000, LENGTH = 0x23800  (same as before)
    flash     (rx)  : ORIGIN = 0x10080000, LENGTH = 0x80000  (524,288 bytes = same as before)

    .shrdsection1 (NOLOAD):
    {
        *(.shrdsection1)
    } > shrdRAM1

    .shrdsection2 (NOLOAD):
    {
        *(.shrdsection2)
    } > shrdRAM2

and then define my variables like this...

uint16 ssVoltsArray[0x3140] CY_SECTION(".shrdsection1"); // defines an array of 12,608 x 16 bit integers at pre-defined RAM location

uint16 ssAmpsArray[0x3140] CY_SECTION(".shrdsection1");  // defines an array of 12,608 x 16 bit integers at pre-defined RAM location

uint16 xchngArray[30] CY_SECTION(".shrdsection2");       // defines an array of 30 x 16 bit integers at pre-defined RAM location

And this worked, with the same memory locations allocated to the same variables on both cores.  If anyone is wondering how you know where the linker has located the variables, you have to open and look in the .map files that are created after a successful build - they're in the project folder, and somewhere in them (search for "shrdsection") you'll find the addresses assigned to the start of the arrays.

Unless there's a way to make the linker assign variables in a specific sequence that I'm not aware of?


Question on bus arbitration

I read the ARM documentation on the Function and operation of arbiter module.  It seems it implements a priority system to decide which master gets use of the bus so I think this means that the following should work:

  1. I select one of my shared locations - e.g. xchngArray[0] as a reference location, let's call it the control word.
  2. If the M0+ reads from this and sees something other than zero, it knows it has work to do
  3. The M4 gathers some data, puts it into the shared data array e.g. ssVoltsArray[xxxx]
  4. Then the M4 writes a value (say 1) into the control word
  5. The M0+ is set up to poll the control word often enough that we can ensure no data will be lost (my responsibility)
  6. When the M0+ sees > 0 in the control word, it takes action which might be to a) read an index value from say xchngArray[1] b) use that index to read from the ssVoltsArray[<indexvaluejustread>] c) write zero back to the control word

Can you see any problems with this approach?

It might be better to have a way for the M4 to trigger an interrupt on the M0+, is there a way to do this?

Thanks


Hi Ted,

The technique you described is one form of the "I have data" / "I got the data" communication scheme that I mentioned in a previous post. Like all methods, it has advantages and disadvantages.
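For reference, the polled control-word scheme from the question can be sketched as follows. This is a minimal, single-slot illustration and not vendor code: the function names are hypothetical, the arrays are plain globals here rather than being placed in the shared section, and atomic_thread_fence is used as a portable stand-in for a data memory barrier (on the target, something like the CMSIS __DMB() intrinsic would be the usual choice).

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

#define SAMPLE_COUNT 0x3140u
#define CTRL_WORD    0u   /* xchngArray[0]: non-zero means "I have data" */
#define CTRL_INDEX   1u   /* xchngArray[1]: index of the sample of interest */

volatile uint16_t ssVoltsArray[SAMPLE_COUNT];
volatile uint16_t xchngArray[30];

/* M4 side: post one sample. Returns false if the M0+ has not yet
 * consumed the previous one (the control word is still non-zero). */
bool m4_post_sample(uint16_t index, uint16_t sample)
{
    assert(index < SAMPLE_COUNT);
    if (xchngArray[CTRL_WORD] != 0u)
    {
        return false;
    }
    ssVoltsArray[index] = sample;
    xchngArray[CTRL_INDEX] = index;
    /* Make sure the data is written before the flag is raised. */
    atomic_thread_fence(memory_order_release);
    xchngArray[CTRL_WORD] = 1u;
    return true;
}

/* M0+ side: one polling pass. Returns true and fills *sample
 * when the control word indicates new data. */
bool m0p_poll(uint16_t *sample)
{
    if (xchngArray[CTRL_WORD] == 0u)
    {
        return false;
    }
    atomic_thread_fence(memory_order_acquire);
    uint16_t index = xchngArray[CTRL_INDEX];
    if (index < SAMPLE_COUNT)
    {
        *sample = ssVoltsArray[index];
    }
    /* Finish reading before handing the slot back ("I got the data"). */
    atomic_thread_fence(memory_order_release);
    xchngArray[CTRL_WORD] = 0u;
    return index < SAMPLE_COUNT;
}
```

Writing the data before the control word, and clearing the control word only after the data has been read, is what keeps the two cores from seeing a half-updated slot. The cost is that the M0+ must poll often enough to keep up with the M4.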

It is possible for one CPU to generate an event interrupt to the other CPU, using the PSoC 6 IPC. For more information, please see the IPC driver documentation in the PDL documentation.

Regards,

Mark

MarkA_91

One other thing, we now have two IPC code examples on www.cypress.com that you may find to be helpful:


Hi Ted:

  1. When you do a Build > Generate Application on a new PSoC Creator project for PSoC 6, linker script files for GCC, MDK, and IAR are automatically created, for each CPU. You can modify those files as needed. You can also create a custom linker script file; I recommend using an auto-generated script file as a template. See project Build Settings, <CPU>, Linker, Custom Linker Script.
  2. The best application note that we currently have on linker scripts is AN89610, PSoC Arm Cortex Code Optimization. It hasn't yet been updated for PSoC 6 but still has useful information on Cortex coding techniques. See Section 9 for information on linker script files.
  3. See answer #1 on using an auto-generated script file as a template for a separate custom file.
  4. I haven't checked, but I think C in general does not guarantee any contiguous order of placement of variables in memory - I believe that it's implementation dependent. I think you only get contiguous placement within the boundaries of elements such as arrays, structures, unions, etc. Some compilers may have special pragmas or keywords to control placement of variables; you would have to check the compiler documentation.
    Outside of C, you can use the linker script to control contiguous placement of variables by putting the variables in separate sections.
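Since C does fix the order of members inside a structure, one further option is to wrap the shared arrays in a single struct and place that one object in the shared section. The sketch below is only an illustration under assumptions: the type name SharedData, the variable name shared, and the section name ".shrdsection" are hypothetical, and the sizes are taken from the arrays and the 0xC540-byte region discussed above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed section name; the attribute is skipped on a host build. */
#if defined(__GNUC__) && defined(__arm__)
#define SHARED_RAM __attribute__((section(".shrdsection")))
#else
#define SHARED_RAM
#endif

/* Member order inside a struct is fixed by the language, so both cores
 * see every array at the same offset from the start of the object. */
typedef struct
{
    uint16_t ssVolts[0x3140];  /* 12,608 x 16-bit readings */
    uint16_t ssAmps[0x3140];   /* 12,608 x 16-bit readings */
    uint16_t xchng[30];        /* control / exchange words */
} SharedData;

/* Compile-time checks: fail the build if the layout is not what both
 * cores expect, or if the object no longer fits the shared region. */
static_assert(offsetof(SharedData, ssAmps) == 0x6280u, "unexpected ssAmps offset");
static_assert(offsetof(SharedData, xchng) == 0xC500u, "unexpected xchng offset");
static_assert(sizeof(SharedData) <= 0xC540u, "SharedData must fit in the shared region");

/* The only object placed in the shared section. */
volatile SharedData shared SHARED_RAM;
```

With a single object in the section, the linker has nothing to reorder, so one shared memory region is enough; the trade-off is that both cores must be built from the same struct definition (for example, a common header).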

Hope this helps,

Mark