I have invested a considerable amount of time in creating a DFB program intended to preprocess input streams. Because the default DFB simulator (v1.4) is next to useless, developing this code required downgrading PSoC Creator from version 4.1 to 2.2, the last version able to run Chris Keeser's extended simulator, which at least shows you something. Cypress support didn't make my task any easier: all four of my recent bug reports have been marked as "Cancelled", despite the attached snippets showing direct violations of the DFB specification. They replied with their default "go away" message, that is, they told me to go to the community forum, as if the community members had access to the simulator sources or could fix the misleading or strikingly wrong documentation. I don't know the reason for this hostility, but if it's intended to repel customers, it surely works like a charm.
But to the point: the attached project has a "DSP" page, which contains a DFB instance together with its program. It is fed by two DMA channels, and the results are collected by another two. The exact input values are irrelevant; the problem is at the control-flow level, not with incorrectly computed results. All four DMA channels work correctly, which I checked with a few stage->hold forwarding snippets. This is all that main.c does: it configures the test environment and lets me see the debug signals on the scope. The code simulates well on both simulators (i.e. the DFB assembler 1.4's and Keeser's), and the results are in full agreement with the C++ reference implementation. On a real chip it is an epic disaster.
The DFB program is composed of two independent calculation engines, but both boil down to the same task: compute a cascade of three 4th-order CIC filters, each decimating by 4, so the combined decimation factor is exactly 64. To cut the complexity at least in half, I bypassed the 'monitoring' part, but even the much simpler SDR part is broken. The obvious sign of correctness would be the frequency of the output DMA transfers: for 310,000 input samples per second, the output rate should be 64 times lower, i.e. 4843.75 samples per second. The scope shows 40..50 kHz with no obvious pattern.
Despite its name, the DFB ALU lacks any logical instructions, so the combined "to 64" counter is implemented as a packed array of three 2-bit counters, updated in a complex delta/compensator way. I've reused one of the semaphores to check how often the csb_cic_comb_integrate state is visited. Far too often.
The desired scenario is as follows: enter csa_sdr_process_data 310e3 times per second, then go to csb_cic_comb_integrate on every fourth cycle of each of the I/Q paths. Because each path has a dedicated CIC filter, this translates into two consecutive visits after every 8 input samples. Then move on to the next CIC level after every fourth of those already filtered cycles, i.e. once per 16 cycles, then once per 64, and finally store the result in holdb. It was designed this way, and this is what happens on the simulator.
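For reference, the intended schedule described above can be written down as a tiny model that anyone can compare against the scope readings. This is only a sketch under stated assumptions: it assumes the input samples alternate I, Q, I, Q, ..., and it uses three plain modulo-4 counters per path rather than the packed delta/compensator encoding used in the actual DFB code. The names `expected_schedule` and `expected_rates` are made up for this illustration and do not appear in the project.

```python
from fractions import Fraction

STAGES = 3          # three cascaded CIC filters
DECIMATION = 4      # each stage decimates by 4
PATHS = ("I", "Q")  # assumption: input samples alternate I, Q, I, Q, ...


def expected_schedule(n_samples):
    """Count how often each decimation level fires for n_samples inputs."""
    # One plain modulo-4 counter per stage and per path (not the packed form).
    counters = {path: [0] * STAGES for path in PATHS}
    visits = [0] * STAGES  # firings per level, summed over both paths
    for n in range(n_samples):
        ctr = counters[PATHS[n % len(PATHS)]]
        for level in range(STAGES):
            ctr[level] = (ctr[level] + 1) % DECIMATION
            if ctr[level] != 0:
                break           # no wrap: higher levels do not run this cycle
            visits[level] += 1  # wrap: this level's decimated step runs
    return visits


def expected_rates(input_rate):
    """Exact firing rate of each level, in events per second."""
    return [Fraction(input_rate, DECIMATION ** (k + 1)) for k in range(STAGES)]


if __name__ == "__main__":
    print(expected_schedule(128))                        # [32, 8, 2]
    print([float(r) for r in expected_rates(310_000)])   # [77500.0, 19375.0, 4843.75]
```

Under these assumptions the first level should fire 77,500 times per second in total, the second 19,375 times, and the third (the holdb store) 4843.75 times. The 40..50 kHz seen on the scope matches none of these rates.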
I was trying to figure out what is going wrong this time with the Cypress tools, but I ran out of steam. I'm on the verge of throwing the entire PSoC adventure in the dustbin and switching to the much better specified Xilinx Zynq family, but I regret all the time and money already spent. So could you please have a look at the attached project and try to guess where the physical implementation of the DFB diverges from its specification so strikingly that my code becomes useless?
home.cywrk.Archive01.zip 1.2 MB