8 Replies Latest reply on Apr 15, 2019 4:04 AM by user_282047883

    Maximizing MIPI throughput with CX3

    user_282047883

      Hi,

       

      we are using the CX3 to get data from an image sensor (ONSemi AR0330CM) and an image signal processor (ONSemi AP1302) over USB to a host PC. We would like to get the maximum possible frame rate, but we are kind of stuck now.

       

      We have configured all components for 1920x1080 px, and we are able to get about 51 fps in YUV 4:2:2. To achieve this, we had to set the THS-EXIT parameter in the AP1302 to increase the pause between frames on the MIPI interface. We believe that this parameter is keeping us from reaching higher frame rates. We have a development kit for the sensor + ISP, and we have verified that the same settings on the kit yield the same results and that reducing THS-EXIT increases the fps.

       

      However, if we reduce THS-EXIT on our hardware, we do not get a picture from the UVC interface; instead, errors are reported in the MIPI error thread, namely "Unrecoverable Sync Byte Error".

       

      I would like to understand why this happens, but I do not see any reason why a pause between packets would be required for the CX3 interface to work as intended. I thought there should be no need for any pauses or waits, as the CX3 MIPI receiver internally uses multiple threads with separate buffers and is able to switch instantly and seamlessly between the threads writing to different buffers.

       

      Can anyone explain how THS-EXIT is related to other parameters and give us a hint about what we can change to reduce this value even further?

       

      Thanks a lot for any ideas, hints and other pieces of information that might help us.

       

      EDIT 04/04/2019:

      After some more experimenting we have concluded that the value required for THS-EXIT depends on the packet size. If we increase the packet size, we have to increase THS-EXIT as well. It looks like the CX3 does need some time to process the data it has received into its buffers before it can receive another packet.

       

      Can anyone clarify whether there is a strict requirement for pauses between MIPI packets, and how to calculate the required pause length?

       

      Best regards

      Ivo

        • 1. Re: Maximizing MIPI throughput with CX3
          KandlaguntaR_36

          THS-Exit is the time that the transmitter drives LP-11 following an HS burst. This value is transmitter specific. As per the spec, the minimum value is 100 ns.

           

          You need to set this parameter as per the MIPI transmitter (manufacturer) specification.

          On the receiver side, we do not control this parameter.

          • 2. Re: Maximizing MIPI throughput with CX3
            user_282047883

            Hi Kandlagunta,

             

            thanks for your reply.

             

            We can set it on the transmitter side (ISP ONSemi AP1302); however, if we set it below a certain value, no image comes from the CX3 over UVC, and it signals an Unrecoverable Sync Byte Error. We have confirmed that the minimum possible value depends directly on the configured MIPI packet size, which seems odd to us. The sensor and ISP themselves work just fine even with THS-Exit set to 0, but in that case an FPGA is used instead of the CX3, so the receiver is different.

             

            So far everything points to the CX3. We think it has some kind of bandwidth/throughput limitation, possibly caused by how it is implemented internally, but that is kind of a wild guess. Can you confirm that there is no need for pauses of a certain length between the MIPI packets in order to get the CX3 MIPI receiver working?

             

            Also, is there a detailed description of the MIPI errors, so we can understand what exactly Unrecoverable Sync Byte Error means in this case? It would be great to know what it indicates.

             

            Best regards

            Ivo

            • 3. Re: Maximizing MIPI throughput with CX3
              KandlaguntaR_36

              Hello,

               

              Please find the description of errors.

               

              frmErrCnt;          /**< Framing Error Count*/   - This field is incremented when an unexpected Frame start or Frame end short packet is received

               

              crcErrCnt;          /**< CRC Error Count*/ - This counter is incremented when a HS packet is received with crc errors.

              mdlErrCnt;          /**< Multi-Data Lane Sync Byte Error Count*/ - This counter is incremented when HS Sync Bytes were not received at the same clock cycle by CSI2 Rx

               

              ctlErrCnt;          /**< Control Error (Incorrect Line State Sequence) Count*/ - This counter is incremented when escape mode is exited using the wrong sequence

               

              eidErrCnt;          /**< Unsupported Packet ID Error Count */ - This counter is incremented when a HS packet that is not supported by CSI-2 Rx is received.

               

              recrErrCnt;         /**< Recoverable Packet Header Error Count*/ - This counter is incremented when a HS packet header is received with errors that are correctable by ECC.

               

              unrcErrCnt;         /**< Unrecoverable Packet Header Error Count*/ - This counter is incremented when a HS packet header is received with errors that are not correctable by ECC

               

              recSyncErrCnt;      /**< Recoverable Sync Byte Error Count*/ - This counter is incremented when a HS Sync Byte was received by Rx with correctable errors

               

              unrSyncErrCnt;      /**< Unrecoverable Sync Byte Error Count*/ - This counter is incremented when a HS Sync Byte was received by Rx with uncorrectable errors.
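For firmware that polls these counters, a small helper that names the first non-zero one can make the debug log easier to read. The following is only a sketch: `MipiErrCounts` is a hypothetical local mirror of the fields listed above (not the SDK type), the `uint8_t` width is an assumption, and `first_mipi_error` is an illustrative name.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical local mirror of the counters listed above, in the same order.
 * This is NOT the SDK type; the uint8_t width is an assumption. */
typedef struct {
    uint8_t frmErrCnt;
    uint8_t crcErrCnt;
    uint8_t mdlErrCnt;
    uint8_t ctlErrCnt;
    uint8_t eidErrCnt;
    uint8_t recrErrCnt;
    uint8_t unrcErrCnt;
    uint8_t recSyncErrCnt;
    uint8_t unrSyncErrCnt;
} MipiErrCounts;

/* Return a short label for the first non-zero counter, or NULL if all are zero. */
const char *first_mipi_error(const MipiErrCounts *e)
{
    assert(e != NULL);
    if (e->frmErrCnt)     return "framing";
    if (e->crcErrCnt)     return "CRC";
    if (e->mdlErrCnt)     return "multi-data-lane sync byte";
    if (e->ctlErrCnt)     return "control (line state sequence)";
    if (e->eidErrCnt)     return "unsupported packet ID";
    if (e->recrErrCnt)    return "recoverable packet header";
    if (e->unrcErrCnt)    return "unrecoverable packet header";
    if (e->recSyncErrCnt) return "recoverable sync byte";
    if (e->unrSyncErrCnt) return "unrecoverable sync byte";
    return NULL;
}
```

In a real project the structure would be filled from the SDK's error query call instead of by hand; the helper itself only depends on the field names.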

               

              Can you please tune the THS_Exit parameter to receive the video streaming at lesser THS_Exit value?

              • 4. Re: Maximizing MIPI throughput with CX3
                user_282047883

                Kandlagunta, thanks for error description, hopefully it will help us.

                 

                "Can you please tune the THS_Exit parameter to receive the video streaming at lesser THS_Exit value?"

                Well, that's what we are trying to do. However, below a certain value it simply stops working, as the CX3 MIPI receiver fails to get data, probably due to broken synchronization, since it increases the unrSyncErrCnt counter.

                 

                We think that the AP1302 does not use any blanking at all (I am still waiting for confirmation from ONSemi; getting support for this device is a bit difficult), but right now everything points that way. Please check my reasoning:

                 

                We do not have a proper MIPI D-PHY analyzer, but we have measured with an oscilloscope that MIPI goes into HS mode for 13.6 us. We have configured 1920x1080p, YUV 4:2:2, so there are 16 bits per pixel. The packet length is configured to be two times the horizontal resolution, so it should be 3840 bytes. The MIPI interface uses 4 lanes at 576 MHz.

                 

                Since we have 1920 pixels at 16 bits per pixel and 4 lanes, we should have 7680 bits per lane for one frame line:

                1920 * 16 / 4 = 7680

                 

                At 576 MHz, you can fit approximately 7834 sampling intervals into the measured HS time of 13.6 us:

                13.6e-6 * 576e6 = 7833.6

                 

                As MIPI has to settle in HS mode and there is some synchronization besides the data itself, it seems quite likely that no blanking is transmitted from the AP1302 to the CX3. The difference between 7833.6 and 7680 is simply these additional transitions and signals.
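The arithmetic above can be reproduced with two small helpers. This is a sketch with illustrative function names; it takes 576e6 as the number of bit times per second on each lane (the figure is clarified later in the thread as 576 Mbps).

```c
#include <assert.h>
#include <stdint.h>

/* Payload bits that each lane carries for one line of video. */
uint32_t bits_per_lane_per_line(uint32_t width_px, uint32_t bits_per_px,
                                uint32_t lanes)
{
    assert(lanes > 0);
    return width_px * bits_per_px / lanes;
}

/* Number of bit times that fit into one HS burst at the given lane bit rate. */
double bit_times_in_burst(double burst_s, double lane_bps)
{
    assert(burst_s >= 0.0 && lane_bps >= 0.0);
    return burst_s * lane_bps;
}
```

With the values from the post, the payload is 7680 bits per lane and the burst holds about 7833.6 bit times, which leaves roughly 154 bit times per line for settling, sync and packet overhead.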

                 

                After one line is transmitted, THS-Exit comes into play and forces the MIPI interface to stay in LP mode for a certain time. The minimum value we can use (such that the CX3 still synchronizes to the data and transmits the image to the host PC) is 3.5 us. We think that the pause between packets caused by THS-Exit effectively acts as a horizontal blanking interval, as it gives the CX3 MIPI receiver time to move data from the line buffer to the buffers in the GPIF interface.
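To see how much frame rate the LP pause costs, the line period can be treated as HS burst plus pause. The sketch below assumes 1080 lines per frame and ignores vertical blanking and any other frame-level overhead, so it gives an upper bound, not a prediction; `max_fps` is an illustrative name.

```c
#include <assert.h>

/* Upper bound on frame rate when every line costs one HS burst plus one
 * LP pause. Vertical blanking is ignored (assumption). Times are in us. */
double max_fps(double hs_burst_us, double lp_pause_us, unsigned lines)
{
    assert(lines > 0);
    assert(hs_burst_us + lp_pause_us > 0.0);
    return 1e6 / ((hs_burst_us + lp_pause_us) * lines);
}
```

With a 13.6 us burst and a 3.5 us pause the bound is about 54 fps, in the same region as the roughly 51 fps reported at the start of the thread; with no pause at all it would be about 68 fps.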

                 

                We suspect that the issue might be caused by the CX3 MIPI receiver not having enough time to move the data between the buffers. This should be fixable by increasing the parallel output clock, but we tried that yesterday and it appears to work only in a very limited range of frequencies around 60 MHz, not higher than that. We are not sure why that happens yet...

                 

                I will discuss it with my colleagues and provide more information about current CX3 MIPI receiver configuration that works for us, albeit with lower frame rate than intended.

                • 5. Re: Maximizing MIPI throughput with CX3
                  KandlaguntaR_36

                  CX3 MIPI receiver is recommended to operate up to 500 MHz (maximum data rate supported per lane is 1 Gbps). It seems that you are operating at 576 MHz. Please operate below 500 MHz for appropriate functionality.

                   

                  The parallel clock of CX3 MIPI (output of MIPI block and input to GPIF II) can be operated up to 100 MHz.

                  Please share the snapshots of MIPI Receiver configuration tab for review.

                  • 6. Re: Maximizing MIPI throughput with CX3
                    user_282047883

                    Kandlagunta, thanks for the warning. We have realized that we are confused about the clocks being used.

                     

                    I have to correct myself: the clocks from the AP1302 are in fact 288 MHz, not 576 MHz. I mixed that up because the AP1302 outputs this value in a field documented as MHz, but it is in fact Mbps. We have measured the clock with an oscilloscope, which confirmed that the clocks are at 288 MHz.
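The MHz/Mbps mix-up comes from the D-PHY clock being double data rate: one bit is transferred on each clock edge, so the per-lane bit rate is twice the clock frequency. A minimal sketch, using the 1 Gbps per-lane limit quoted earlier in the thread (function and macro names are illustrative):

```c
#include <assert.h>
#include <stdint.h>

#define CX3_MAX_LANE_MBPS 1000u  /* per-lane limit quoted earlier in the thread */

/* Per-lane bit rate for a DDR clock: two bits per clock period. */
uint32_t ddr_lane_mbps(uint32_t clock_mhz)
{
    assert(clock_mhz <= UINT32_MAX / 2u);
    return 2u * clock_mhz;
}

/* Non-zero if the resulting lane bit rate stays within the quoted limit. */
int within_lane_limit(uint32_t clock_mhz)
{
    return ddr_lane_mbps(clock_mhz) <= CX3_MAX_LANE_MBPS;
}
```

A 288 MHz clock therefore corresponds to 576 Mbps per lane, which is inside the limit, whereas a true 576 MHz clock would not be.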

                     

                    The MIPI receiver configuration currently in use is as follows:

                     

                    CY_U3P_CSI_DF_RGB888,  /* CyU3PMipicsiDataFormat_t dataFormat */

                    4,   /* uint8_t numDataLanes */

                    2, /* uint8_t pllPrd */

                    75, /* uint16_t pllFbd */

                    CY_U3P_CSI_PLL_FRS_500_1000M, /* CyU3PMipicsiPllClkFrs_t pllFrs */

                    CY_U3P_CSI_PLL_CLK_DIV_8, /* CyU3PMipicsiPllClkDiv_t csiRxClkDiv */

                    CY_U3P_CSI_PLL_CLK_DIV_8, /* CyU3PMipicsiPllClkDiv_t parClkDiv */

                    0,  /* uint16_t mClkCtl */

                    CY_U3P_CSI_PLL_CLK_DIV_4, /* CyU3PMipicsiPllClkDiv_t mClkRefDiv */

                    1920,  /* uint16_t hResolution */

                    0  /* uint16_t fifoDelay */

                     

                    I have to admit that my colleague obtained this configuration by trial and error, as the configuration obtained from the calculator was not working. We can change pllFbd only within a small range (73 to 76, I think); outside of this range we do not get a picture from the CX3 (sync error). We have also tried increasing fifoDelay, but we were not able to get better results with that. With a wrong fifoDelay value we got broken pictures (vertical split).

                     

                    To make things even more confusing, we are able to get up to 65 fps using only 2 MIPI lanes, which should not be possible as the bitrate is higher than 2 Gbps (2 lanes at 1 Gbps each). But it is in fact the best result (highest frame rate) we have ever obtained. The MIPI receiver configuration is as follows:

                       CY_U3P_CSI_DF_RGB888,  /* CyU3PMipicsiDataFormat_t dataFormat */

                       2,   /* uint8_t numDataLanes */

                       3, /* uint8_t pllPrd */

                       124, /* uint16_t pllFbd */

                       CY_U3P_CSI_PLL_FRS_500_1000M, /* CyU3PMipicsiPllClkFrs_t pllFrs */

                       CY_U3P_CSI_PLL_CLK_DIV_8, /* CyU3PMipicsiPllClkDiv_t csiRxClkDiv */

                       CY_U3P_CSI_PLL_CLK_DIV_8, /* CyU3PMipicsiPllClkDiv_t parClkDiv */

                       0,  /* uint16_t mClkCtl */

                       CY_U3P_CSI_PLL_CLK_DIV_2, /* CyU3PMipicsiPllClkDiv_t mClkRefDiv */

                       1920,  /* uint16_t hResolution */

                       1  /* uint16_t fifoDelay */
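The claim that 65 fps over 2 lanes exceeds 2 Gbps can be checked from the frame geometry alone. The sketch below counts payload bits only (no blanking or packet overhead, which is an assumption), so the real rate would be somewhat higher; `required_lane_bps` is an illustrative name.

```c
#include <assert.h>
#include <stdint.h>

/* Payload bit rate each lane must carry for the given video mode. */
uint64_t required_lane_bps(uint32_t width, uint32_t height, uint32_t fps,
                           uint32_t bits_per_px, uint32_t lanes)
{
    assert(lanes > 0);
    return (uint64_t)width * height * fps * bits_per_px / lanes;
}
```

For 1920x1080 at 65 fps and 16 bits per pixel this gives 1,078,272,000 bps per lane on 2 lanes (about 2.16 Gbps in total), and 539,136,000 bps per lane on 4 lanes.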

                     

                    I have tried to get a correct configuration using the MIPI receiver configuration tool, but it always complains about the CSI clock being too slow for the frame rate we want; see the following screenshot.

                    1080p70.png

                     

                    As you can see, there is a minimum CSI clock value of 435.46 (how is that even calculated?), and it also shows an impossible minimum limit for the output pixel clock. I really have no idea how we are supposed to configure it if there are contradicting requirements (the value should be min. 145 and max. 100 at the same time).

                     

                    I am sorry for not being able to provide better information. We are not very familiar with MIPI so far, so we are still a bit confused about some areas.

                     

                    I will be very glad for any kind of help. If you need more information from me, just let me know. I will try to provide everything.

                     

                    Best regards

                    Ivo

                    • 7. Re: Maximizing MIPI throughput with CX3
                      KandlaguntaR_36

                      Hello,

                       

                      As per your test, the MIPI lanes are working at a data rate more than 1 Gbps.

                      But you are not supposed to operate them at more than 1 Gbps. Please use four lanes.

                       

                      The MIPI configuration tool has some issues in calculating the minimum CSI clock and parallel clock. We are going to fix them in the next SDK revision.

                       

                      Please refer to the following KBAs to understand the calculations done by the tool: 1. Analysis of CX3 Video Timing parameters 2. Analysis of CX3 Clocking Parameters

                       

                      As per the first KBA, the minimum CSI clock required is 291 MHz. Based on the settings above (screenshot of the CX3 Receiver configuration tool), I have generated the following MIPI configuration. Please use it.

                       

                      CY_U3P_CSI_DF_RGB888,  /* CyU3PMipicsiDataFormat_t dataFormat */

                      4,   /* uint8_t numDataLanes */

                      2, /* uint8_t pllPrd */

                      123, /* uint16_t pllFbd */

                      CY_U3P_CSI_PLL_FRS_250_500M, /* CyU3PMipicsiPllClkFrs_t pllFrs */

                      CY_U3P_CSI_PLL_CLK_DIV_4, /* CyU3PMipicsiPllClkDiv_t csiRxClkDiv */

                      CY_U3P_CSI_PLL_CLK_DIV_4, /* CyU3PMipicsiPllClkDiv_t parClkDiv */

                      0,  /* uint16_t mClkCtl */

                      CY_U3P_CSI_PLL_CLK_DIV_2, /* CyU3PMipicsiPllClkDiv_t mClkRefDiv */

                      1920,  /* uint16_t hResolution */

                      50  /* uint16_t fifoDelay */

                       

                      Configure a PHY delay of 11 using the CyU3PMipicsiSetPhyTimeDelay API.

                       

                      Please test the device with above settings and probe the HSYNC, VSYNC, PCLK test pins; measure the active and blanking of HSYNC and VSYNC, and frequency of the clock (parallel clock).

                      • 8. Re: Maximizing MIPI throughput with CX3
                        user_282047883

                        Hi Kandlagunta,

                         

                        thanks for your response. I have checked the referenced KBA and recalculated the minimum CSI input clock to understand it better. I obtained a minimum of 290.304 MHz, which matches your information.
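The 290.304 MHz figure can be reproduced from the video mode discussed in the thread (1920x1080, 70 fps, 16 bits per pixel, 4 lanes). The sketch below assumes payload bits only, with no blanking, and a DDR clock that carries two bits per period; the function names are illustrative and the KBA's own derivation may be phrased differently.

```c
#include <assert.h>
#include <stdint.h>

/* Pixels per second for the given mode, ignoring blanking (assumption). */
uint64_t min_pixel_rate_hz(uint32_t width, uint32_t height, uint32_t fps)
{
    return (uint64_t)width * height * fps;
}

/* Minimum CSI (DDR) clock: payload bits per second spread over the lanes,
 * with two bits transferred per clock period on each lane. */
uint64_t min_csi_clock_hz(uint32_t width, uint32_t height, uint32_t fps,
                          uint32_t bits_per_px, uint32_t lanes)
{
    assert(lanes > 0);
    return min_pixel_rate_hz(width, height, fps) * bits_per_px / (lanes * 2u);
}
```

For 1080p70 this yields 290,304,000 Hz. The intermediate pixel rate of 145,152,000 pixels per second also appears to be where the "min. 145" output pixel clock shown by the tool comes from.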

                         

                        I have to apologize: I mistakenly wrote that the reference clock speed is 19.2 MHz, but we in fact use 24 MHz (I mixed it up with CLKIN on the CX3). We changed the REFCLK value to 24 MHz and adapted Multiplier of Unit Clk to 99 so we get a 99 MHz Output Pixel Clock.

                         

                        Adapted configuration structure:

                            CY_U3P_CSI_DF_RGB888,  /* CyU3PMipicsiDataFormat_t dataFormat */

                            4,                          /* uint8_t numDataLanes */

                            2, /* uint8_t pllPrd */

                            98, /* uint16_t pllFbd */

                            CY_U3P_CSI_PLL_FRS_250_500M, /* CyU3PMipicsiPllClkFrs_t pllFrs */ 

                            CY_U3P_CSI_PLL_CLK_DIV_4, /* CyU3PMipicsiPllClkDiv_t csiRxClkDiv */

                            CY_U3P_CSI_PLL_CLK_DIV_4, /* CyU3PMipicsiPllClkDiv_t parClkDiv */

                            0,                 /* uint16_t mClkCtl */

                            CY_U3P_CSI_PLL_CLK_DIV_2, /* CyU3PMipicsiPllClkDiv_t mClkRefDiv */

                            1920,         /* uint16_t hResolution */

                            50                         /* uint16_t fifoDelay */
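The clock that results from such a configuration can be estimated with a few lines of arithmetic. The formula below, PLL = REFCLK * (pllFbd + 1) / (pllPrd + 1) / 2^FRS, is an assumption that the thread itself does not state, as is the FRS encoding (0 for the 500-1000 MHz range, 1 for 250-500 MHz); it does reproduce the 99 MHz pixel clock mentioned above. Function names are illustrative.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed PLL relation: refclk * (fbd + 1) / (prd + 1) / 2^frs_shift.
 * frs_shift: 0 = 500-1000 MHz range, 1 = 250-500 MHz range (assumed). */
uint64_t pll_khz(uint32_t refclk_khz, uint32_t prd, uint32_t fbd,
                 uint32_t frs_shift)
{
    assert(frs_shift < 4);
    return (((uint64_t)refclk_khz * (fbd + 1u)) / (prd + 1u)) >> frs_shift;
}

/* Parallel (output pixel) clock after the parClkDiv divider. */
uint64_t par_clk_khz(uint64_t pll_clk_khz, uint32_t par_div)
{
    assert(par_div > 0);
    return pll_clk_khz / par_div;
}
```

With REFCLK = 24 MHz, pllPrd = 2, pllFbd = 98, the 250-500 MHz range and a divider of 4, this gives a 396 MHz PLL clock and a 99 MHz parallel clock. The earlier configuration with pllFbd = 123 at a 19.2 MHz reference comes out at 99.2 MHz.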

                         

                        We have set our sensor and ISP to output 1080p30 (this should be easier to achieve than 70 fps; also, our MIPI clocks are at 288 MHz right now, which is not enough for 70 fps, but should be fine for 69 or less). With this configuration we obtained the following results:

                          - PCLK outputs 99 MHz (as expected)

                          - VSYNC outputs 30 Hz (as expected)

                          - HSYNC is unknown, we unfortunately do not have this pin connected
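The "69 or less" estimate follows from dividing the link capacity by the bits per frame. A minimal sketch, assuming a DDR clock (two bits per period per lane) and payload bits only, with no blanking or packet overhead; `max_fps_for_clock` is an illustrative name.

```c
#include <assert.h>
#include <stdint.h>

/* Highest whole frame rate whose payload fits into the link capacity. */
uint32_t max_fps_for_clock(uint64_t lane_clock_hz, uint32_t lanes,
                           uint32_t width, uint32_t height,
                           uint32_t bits_per_px)
{
    uint64_t bits_per_frame = (uint64_t)width * height * bits_per_px;
    assert(bits_per_frame > 0);
    uint64_t link_bps = lane_clock_hz * 2u * lanes;  /* DDR: 2 bits per period */
    return (uint32_t)(link_bps / bits_per_frame);
}
```

At 288 MHz on 4 lanes the result for 1920x1080 at 16 bits per pixel is 69 fps. The same calculation for 2 lanes at the 500 MHz limit gives 60 fps.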

                         

                        So far so good, it seems. However, there is no output from UVC, and there is actually no data coming over USB into the host PC (checked with Wireshark). The MIPI thread prints the following output:

                        state 0xD glDMATxCount 0 glDmaDone 0 errCnts 1 1 0 0 0 0 0 0 0

                        state 0xD glDMATxCount 0 glDmaDone 0 errCnts 0 0 0 0 0 0 0 0 0

                        state 0xD glDMATxCount 0 glDmaDone 0 errCnts 0 0 0 0 0 0 0 0 0

                        ...

                        I am not sure if the state value corresponds to what is described in Figure 7 of the CX3 TRM (Doc. No. 001-91492 Rev. *B), but if it does, it suggests that the state machine is stuck in State 13, which is Start. I believe that the transition from Start to Wait for Frame Done should be automatic, so I do not understand why it is staying in Start.
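When many such lines are captured from the debug UART, it can help to parse them on the host and watch how the state and counters change over time. The sketch below assumes the exact line layout shown above; `MipiLogLine` and `parse_mipi_log` are hypothetical names, and the thread does not state which of the nine counts corresponds to which error counter, so they are kept as a plain array.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* One parsed debug line; field names are illustrative. */
typedef struct {
    unsigned state;         /* hexadecimal in the log, e.g. 0xD == 13 */
    unsigned dma_tx_count;
    unsigned dma_done;
    unsigned err[9];        /* the nine counts, in printed order */
} MipiLogLine;

/* Returns 1 if the line matches the expected layout, 0 otherwise. */
int parse_mipi_log(const char *line, MipiLogLine *out)
{
    assert(line != NULL && out != NULL);
    int n = sscanf(line,
        " state %x glDMATxCount %u glDmaDone %u errCnts"
        " %u %u %u %u %u %u %u %u %u",
        &out->state, &out->dma_tx_count, &out->dma_done,
        &out->err[0], &out->err[1], &out->err[2],
        &out->err[3], &out->err[4], &out->err[5],
        &out->err[6], &out->err[7], &out->err[8]);
    return n == 12;
}
```

For the first line of the output above, this yields state 13 with the first two counts set to 1 and the rest at 0.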

                         

                        I believe that the MIPI receiver is able to distinguish the frames, as it outputs the expected VSYNC pulses on the respective pin, so I am not sure what is actually broken here.

                         

                        I will try to do more tests and hopefully provide more details.

                         

                        Best regards

                        Ivo