How do I optimize memory read throughput for parallel NOR flash storage?
The following procedure has proved useful for finding the optimum system settings in many cases. Start with the parallel NOR (pNOR) flash in asynchronous mode and configure the memory controller settings (wait states, hold times, etc.) so that the specifications of both the flash device and the controller are met, even when signal delays (PCB traces, additional components, etc.) are taken into account. Verify correct operation with a logic analyzer on the bus and in software (for example, by comparing the data read back against a known image). Afterwards, you can enable page or burst mode to increase performance further. In many cases, a slight change of the controller clock frequency helps to match the device limits more closely, since timings are typically programmed in whole clock cycles. Once the timings have been optimized, check the corresponding waveforms for stalls, i.e. longer idle times on the bus. Such gaps are usually caused by data move activities inside the processor and can often be optimized away. Note that DMA, although it frees the CPU for other activities, does not always give the highest throughput. Classic PIO via memcpy() often yields better results if the CPU is available for I/O operations (e.g. during system boot).
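To see why a slightly different controller clock can match the device limits better, it can help to work through the arithmetic. The following is a minimal sketch, not a driver: the struct, the function names and the example numbers (70 ns access time, 5 ns board delay, 16-bit bus) are all hypothetical, and it assumes a simple asynchronous read whose length is set in whole controller clock cycles. Real controllers split a read into setup, wait and hold phases and may count wait states with an offset, so check the controller's reference manual before using values like these.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical timing description; all fields are illustrative. */
typedef struct {
    uint32_t clk_hz;         /* memory controller clock in Hz */
    uint32_t t_acc_ns;       /* flash access time from the datasheet, in ns */
    uint32_t board_delay_ns; /* extra margin for traces, buffers, etc., in ns */
} nor_timing_t;

/* Smallest number of clock cycles that covers 'ns' nanoseconds (rounds up). */
static uint32_t ns_to_cycles(uint32_t ns, uint32_t clk_hz)
{
    assert(clk_hz > 0);
    uint64_t scaled = (uint64_t)ns * clk_hz;
    return (uint32_t)((scaled + 999999999ULL) / 1000000000ULL);
}

/* Clock cycles needed for one asynchronous read access. */
static uint32_t nor_read_cycles(const nor_timing_t *t)
{
    return ns_to_cycles(t->t_acc_ns + t->board_delay_ns, t->clk_hz);
}

/* Ideal back-to-back asynchronous read rate in bytes per second
 * for a bus that transfers 'bus_bytes' bytes per access. */
static uint64_t nor_async_bytes_per_sec(const nor_timing_t *t, uint32_t bus_bytes)
{
    uint32_t cycles = nor_read_cycles(t);
    return cycles ? ((uint64_t)t->clk_hz * bus_bytes) / cycles : 0;
}
```

With these example numbers, a 100 MHz clock needs 8 cycles (80 ns) to cover the required 75 ns, whereas an 80 MHz clock needs exactly 6 cycles (75 ns). The lower clock therefore gives the higher ideal read rate on a 16-bit bus, about 26.7 MB/s instead of 25 MB/s. Measured throughput will be lower wherever the bus stalls between accesses.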