The only restriction is the GPIF II interface itself. It is a 32-bit parallel interface that can run at up to about 100 MHz, so you can transfer roughly 400 MB/s (4 bytes × 100 MHz) in AUTO DMA channel mode.
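To make the arithmetic behind the 400 MB/s figure explicit, here is a minimal sketch in C. The function name `gpif_peak_bytes_per_sec` is made up for illustration and is not part of the FX3 SDK; it simply assumes one full bus word is transferred on every clock cycle.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper (not an FX3 SDK function): theoretical peak
 * throughput of a synchronous parallel bus, assuming one bus-width
 * word is transferred on every clock cycle. */
static uint64_t gpif_peak_bytes_per_sec(uint32_t bus_width_bits, uint64_t clock_hz)
{
    /* The bus width must be a whole number of bytes. */
    assert(bus_width_bits % 8 == 0);

    /* bytes per word * words per second */
    return (uint64_t)(bus_width_bits / 8) * clock_hz;
}
```

Calling `gpif_peak_bytes_per_sec(32, 100000000)` gives the 32-bit, 100 MHz case discussed above. This is only the upper bound on the GPIF side; it says nothing about what the rest of the data path can sustain.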
What you read probably refers to applications such as UVC, where a header and a footer have to be added to the data. When the ARM core of the FX3 has to add data like this, the DMA channels must be configured as MANUAL, and then you cannot reach the same bandwidth as in AUTO mode.
If I understand your application correctly, you feed the data into the GPIF interface without any CPU (ARM core) intervention, so you get the highest bandwidth the FX3 offers: almost 400 MB/s. It is slightly less because a little time is lost switching DMA buffers when the FLAG is asserted.
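As a rough way to see why the real figure ends up slightly below the peak, the sketch below models a stream that pauses briefly at every DMA buffer switch. This is a simplified model, not a description of how the FX3 hardware works internally: the function name `effective_bytes_per_sec` and the `switch_seconds` parameter are assumptions for illustration, and the actual switch time has to be measured on your own hardware.

```c
#include <assert.h>

/* Hypothetical model (not an FX3 SDK function): each DMA buffer takes
 * buffer_bytes / peak_bytes_per_sec to fill at full bus speed, followed
 * by a fixed pause of switch_seconds before the next buffer is ready.
 * switch_seconds is an assumed input, not a documented FX3 value. */
static double effective_bytes_per_sec(double buffer_bytes,
                                      double peak_bytes_per_sec,
                                      double switch_seconds)
{
    assert(buffer_bytes > 0.0);
    assert(peak_bytes_per_sec > 0.0);
    assert(switch_seconds >= 0.0);

    /* Time spent actually moving one buffer's worth of data. */
    double transfer_seconds = buffer_bytes / peak_bytes_per_sec;

    /* Bytes moved per buffer, divided by the total time per buffer. */
    return buffer_bytes / (transfer_seconds + switch_seconds);
}
```

With `switch_seconds` set to zero the model returns the peak rate, and any nonzero pause pulls the result below it. It also shows that larger DMA buffers reduce the relative cost of each switch, since the pause is paid once per buffer.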