Generally, two separate 8-bit UDB datapath blocks will not behave as a single 16-bit one, particularly if some math is being performed inside the UDB.
Attached is an example of parallel input into a 16-bit UDB datapath using the FIFOin custom component by Bradley Budlong. It was specifically designed as fast intermediate storage for digital data that needs to be transferred to RAM by DMA (slow). In this example a wide control register (ControlReg32) is used to write 16-bit data into FIFOin on every Clock_STRB. Once the FIFO is half-filled, the data begins transferring into a RAM buffer using DMA. ControlReg32 is a custom component, similar to the standard Control Register but with a wider (up to 32-bit) bus output; it is already included in the project. The FIFOin library is attached and must be unzipped and imported into Project->Dependencies.
I recommend using the FIFOin component as a starting example of wide parallel input to the UDB, and then modifying it further as needed. From personal experience, this can be a very mind-boggling and time-consuming entertainment.
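To illustrate why the math matters, here is a minimal software sketch in plain C (not UDB or Verilog code) of the difference between two independent 8-bit adders and two adders with the carry chained between them. The function names add_unchained and add_chained are made up for this illustration, and the model only assumes that a chained datapath passes the low-byte carry into the high byte.

```c
#include <assert.h>
#include <stdint.h>

/* Two independent 8-bit adders: the carry out of the low byte is lost. */
uint16_t add_unchained(uint16_t a, uint16_t b)
{
    uint8_t lo = (uint8_t)((a & 0xFFu) + (b & 0xFFu));
    uint8_t hi = (uint8_t)((a >> 8) + (b >> 8));
    return (uint16_t)(((uint16_t)hi << 8) | lo);
}

/* Two 8-bit adders with the low-byte carry fed into the high byte. */
uint16_t add_chained(uint16_t a, uint16_t b)
{
    unsigned lo_sum = (a & 0xFFu) + (b & 0xFFu);
    unsigned carry = lo_sum >> 8;          /* carry out of the low byte */
    assert(carry <= 1u);                   /* two 8-bit values give at most one carry bit */
    uint8_t lo = (uint8_t)lo_sum;
    uint8_t hi = (uint8_t)((a >> 8) + (b >> 8) + carry);
    return (uint16_t)(((uint16_t)hi << 8) | lo);
}
```

For example, adding 0x00FF and 0x0001 gives 0x0000 with the unchained version and 0x0100 with the chained one, which is the sort of mismatch you would expect whenever a result crosses the byte boundary.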
Thanks for posting this response, it is an excellent example of a parallel interface, but unfortunately uses a lot of hand-written Verilog, and I'm not sure my coding skills are (yet) up to the task of modifying it.
This does highlight the gap between the easy-to-use graphical software tools, and the complexity of Verilog coding; I was hoping that some tweaks to cy_psoc_dp16 could provide a bridge between the two worlds, making PI and PO more accessible to those of us who aren't experienced Verilog coders.
I agree that the FIFOin component is quite convoluted. But the essential parts of it are the 2 chained datapaths and the parallel input. By comparing the datapath section of the parallel adder from AN81256 (or your original example) with that of FIFOin, you can find a solution.
In the end it will look like the parallel adder demo: on one clock it reads PI, on another clock it does some math. You can ignore the parallel output if it is not needed and use DMA to read out the registers.
Is there any specific math that you want to perform in the UDB?
Before closing this topic, thought I should point out the 2 bugs in my udb_test code, that cancel each other out, so it gives the appearance of working OK.
The incorrect lines in main.c are:
#define UDB_1_A0 udb_1_test__A0_REG // [8-bit access]
CyDmaTdSetConfiguration(dma_td, 2, DMA_DISABLE_TD, TD_INC_SRC_ADR|TD_INC_DST_ADR); // [incr src & dest addrs]
The definition refers to the 8-bit version of the A0 register, so all DMA accesses will be 8 bits wide, not 16. This would normally mean that the DMA controller transfers 2 bytes from the same location, so it reads the UDB1 A0 value twice. However, I had also set TD_INC_SRC_ADR, so the source address increments after the first byte, and the A0 values of both UDBs are copied OK.
So this demo produces the right answer, but if you tried to DMA several words from the UDBs, the incrementing source address would fetch garbage values from adjacent UDB registers.
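The way the two bugs cancel out can be sketched with a toy model in plain C. This is not PSoC code: toy_dma_copy and the 8-entry register array are hypothetical, and the layout (UDB1 A0 at offset 0, UDB2 A0 at offset 1, unrelated registers after that) is only an assumption chosen to mirror the description above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define REG_SPACE_SIZE 8u   /* size of the hypothetical 8-bit register window */

/* Toy byte-wide DMA: copies `count` bytes from regs[src] to dst,
 * optionally incrementing the source offset after each byte. */
void toy_dma_copy(const uint8_t *regs, size_t src, uint8_t *dst,
                  size_t count, int inc_src)
{
    for (size_t i = 0; i < count; i++) {
        assert(src < REG_SPACE_SIZE);   /* stay inside the toy register window */
        dst[i] = regs[src];
        if (inc_src) {
            src++;                      /* models TD_INC_SRC_ADR */
        }
    }
}
```

With regs = {0x34, 0x12, 0xAA, 0xBB, ...}, a 2-byte copy with inc_src set yields 0x34, 0x12 (both A0 bytes, apparently correct), a 2-byte copy without it yields 0x34, 0x34 (the same byte twice), and a 4-byte copy with inc_src set runs on into 0xAA, 0xBB, which stand in for whatever registers happen to sit at the following addresses.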
The corrected code is:
#define UDB_1_A0 udb_1_test__16BIT_A0_REG // 16-bit access
CyDmaTdSetConfiguration(dma_td, 2, DMA_DISABLE_TD, TD_INC_DST_ADR); // Increment dest addr only
In reading about DMA accesses, I'd ignored the bit about 8- and 16-bit spokes, thinking that the DMA configuration would sort this out. But in reality, the width of a spoke is defined by its address value alone; if you specify an 8-bit spoke address, nothing in the DMA configuration will change it to a 16-bit transfer. Believe me, I wasted a lot of time trying before I found the real cause.