The GpifToUsb example is mainly intended for measuring maximum throughput. It cannot be used as-is for reliable data transfers.
As you may be aware, data arriving on the GPIF is received by the FX3 and stored in its DMA buffers before being sent over USB.
Assume that your example has 4 DMA buffers of 16 KB each. Once the transfer starts, bytes from the GPIF start filling the first buffer. When the first buffer is full, the FX3's DMA engine takes a finite time (a few µs) to make the second buffer ready. Any data arriving from the GPIF during that interval is lost.
The bytes you reported as lost most likely arrived during this buffer-switching time.
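To get a feel for the scale of the loss, here is a rough back-of-the-envelope sketch in Python. The bus width, clock rate and switch time are illustrative assumptions only (the actual switch time is just "a few µs" and depends on your configuration), and the function name is made up for this example.

```python
def bytes_lost_per_switch(bus_width_bits, clock_hz, switch_time_ns):
    """Estimate the bytes driven onto the GPIF while no DMA buffer is ready.

    Assumes the external device pushes one word on every clock cycle and
    never pauses. Integer arithmetic keeps the result exact.
    """
    bytes_per_second = (bus_width_bits // 8) * clock_hz
    return bytes_per_second * switch_time_ns // 1_000_000_000


# Assumed figures, not measured values: 32-bit bus, 100 MHz clock,
# 2 us (2000 ns) buffer-switch time.
lost = bytes_lost_per_switch(32, 100_000_000, 2000)
print(lost)  # 800 bytes lost at every buffer switch under these assumptions
```

With 16 KB buffers, a loss of this size would recur each time a buffer fills, so the gaps show up at regular intervals in the received stream.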
We therefore recommend implementing a flag-based handshake between the FPGA and the GPIF. Please refer to the application note below for more details (especially the sections on Flag A and Flag B). The application note also comes with an example project.
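The effect of such a handshake can be illustrated with a small toy model in Python. This is only a sketch under simplifying assumptions: a single "ready" flag with zero latency, one word per clock cycle, and hypothetical names and numbers throughout. It is not FX3 firmware and does not reproduce the Flag A / Flag B behaviour described in the application note.

```python
def simulate(total_words, buffer_words, switch_cycles, use_flag):
    """Return (accepted, lost) word counts for a toy GPIF-to-DMA model.

    buffer_words  : words that fit in one DMA buffer
    switch_cycles : clock cycles during which no buffer is ready
    use_flag      : True if the source waits whenever the flag says "not ready"
    """
    accepted = lost = 0
    fill = 0   # words in the current buffer
    busy = 0   # cycles left until the next buffer is ready
    sent = 0   # words the source has driven so far
    while sent < total_words:
        ready = busy == 0  # models the "buffer ready" flag
        if ready:
            sent += 1
            accepted += 1
            fill += 1
            if fill == buffer_words:
                # Buffer full: start switching to the next one.
                fill = 0
                busy = switch_cycles
        else:
            busy -= 1
            if not use_flag:
                # Source ignores the flag and drives a word nobody stores.
                sent += 1
                lost += 1
    return accepted, lost


# Assumed figures: 4096 words per buffer (16 KB of 32-bit words),
# 200 cycles of switch time, 10000 words to send.
print(simulate(10000, 4096, 200, use_flag=False))  # (9600, 400)
print(simulate(10000, 4096, 200, use_flag=True))   # (10000, 0)
```

In real hardware the flag takes some cycles to propagate to the FPGA, so the source usually has to stop slightly before the buffer is completely full; the model above ignores that latency.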