ESD strike causes corrupt data to be returned

damc_1337896

As the title states, after an ESD strike on our product, which contains an FPGA and an FX3 chip, vendor commands start returning corrupt data packets.

An example of an expected data packet is 55, 33, 56, 43, 0, 0, 1, 8, 4, 0, 0, 0, but we are seeing 0, 0, 1, 8, 55, 33, 56, 43, d0, 1, 0, 0. It looks like adjacent 4-byte groups are getting swapped and random data appears in the final 4 bytes.
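The first eight bytes of that packet are consistent with two adjacent 32-bit words trading places. Below is a minimal host-side sketch that a test tool could use to recognise this signature. It assumes the values quoted above are hexadecimal (the `d0` suggests so), and the helper names `swap_word_pairs` and `looks_pair_swapped` are hypothetical, not part of any SDK.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Swap each pair of adjacent 4-byte words in place.
 * A trailing word without a partner, and any bytes after the last
 * whole pair, are left untouched. */
static void swap_word_pairs(uint8_t *buf, size_t len)
{
    for (size_t i = 0; i + 8 <= len; i += 8) {
        uint8_t tmp[4];
        memcpy(tmp, buf + i, 4);
        memcpy(buf + i, buf + i + 4, 4);
        memcpy(buf + i + 4, tmp, 4);
    }
}

/* Return 1 if the first n bytes of `got` equal the first n bytes of
 * `expected` after a word-pair swap, else 0. n is limited to 64 here. */
static int looks_pair_swapped(const uint8_t *expected,
                              const uint8_t *got, size_t n)
{
    uint8_t tmp[64];

    if (n > sizeof tmp)
        return 0;
    memcpy(tmp, expected, n);
    swap_word_pairs(tmp, n);
    return memcmp(tmp, got, n) == 0;
}
```

With the two packets quoted above, checking the first 8 bytes reports a match. The last 4 bytes have no partner word and differ from the expected data, which fits the "random data" observation.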

We have seen this

We have added a simple vendor command that returns the character "A" when requested. It involves no input from the FPGA, only the FX3.

Code snippet for the simple vendor command:

case 0xa1:
    /* Return the single byte 'A' (0x41) on the control endpoint. */
    Ep0_Buf[0] = 0x41;
    apiRetStatus = CyU3PUsbSendEP0Data(1, Ep0_Buf);
    break;

After getting the FX3 into this weird state and asking for the character, a random byte is returned.

All vendor commands still seem to execute: anything going into the device works as expected, but data coming out of the FX3 is wrong.

I would like to know if there is a way to reset memory or something else inside the FX3 to get the data pointers working again.

We would prefer not to reset the FX3 with CyU3PDeviceReset, because it causes the product to disconnect from the PC.

Thanks for any assistance.

SrinathS_16

Hello,

Please use the CyU3PDeviceReset() API with the isWarmReset parameter set to CyTrue. In this case, the firmware starts executing from the beginning and the global variables are re-initialized. Kindly test this and let us know if the data is still corrupted.

Best regards,

Srinath S


I have tried the CyU3PDeviceReset() API with the isWarmReset parameter set to CyFalse; this restores correct data from our device.

I have also used the following link to test the reset command:

Re: Implementation of Soft Reset in FX3 family devices

My problem is that CyU3PDeviceReset() causes the device to disconnect from the PC, which I don't want, because the PC software then has to re-open the device.

We also use a second-stage bootloader, so should we be jumping back to the bootloader before trying to use CyU3PDeviceReset()?

We also have a simple vendor command in the second-stage bootloader that returns the character "B" when requested.

When we are getting corrupt data, we can jump to the bootloader, and its simple vendor command gives the expected data.

Thanks


Hello,

- Please let me know which boot option is used in your second stage boot-loader.

- Are the descriptors that are used in the second stage boot-loader and the main application firmware the same?

- Are you setting the noReEnum parameter to CyTrue in the CyFx3BootUsbStart() API?

- When the noReEnum parameter is set to CyTrue and the firmware is loaded, there will be no re-enumeration but the firmware will be re-loaded and will start execution from the entry point. So, in case of data corruption, you can jump to second stage boot-loader (which sets the noReEnum parameter to CyTrue) and load the new application firmware.

Best regards,

Srinath S


The boot option is set to I2C with USB fallback.

Yes, the descriptors are the same for the second-stage bootloader and the application code.

Yes, the noReEnum parameter is set to CyTrue in the CyFx3BootUsbStart() API.

Sorry, I'm not following your last point.

During normal operation our code goes from the second-stage bootloader to the application code. This can be checked with the simple vendor commands added to both projects, which return an "A" or a "B" character.
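On the host side, the reply to that probe can be turned into a three-way check. This is only a sketch under the assumptions stated in this thread (the application answers "A", the bootloader answers "B", and exactly one byte is returned); the type and function names are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

typedef enum {
    FW_APPLICATION,  /* reply was 'A' (0x41) */
    FW_BOOTLOADER,   /* reply was 'B' (0x42) */
    FW_CORRUPT       /* wrong length or any other byte */
} fw_probe_result;

/* Classify the reply to the identity probe. `len` is the number of
 * bytes the host actually received for the request. */
static fw_probe_result classify_probe(const uint8_t *reply, size_t len)
{
    if (reply == NULL || len != 1)
        return FW_CORRUPT;

    switch (reply[0]) {
    case 0x41: return FW_APPLICATION;
    case 0x42: return FW_BOOTLOADER;
    default:   return FW_CORRUPT;
    }
}
```

Because the probe never touches the FPGA or the EEPROM, an `FW_CORRUPT` result points at the FX3 return path rather than at the I2C peripherals.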

Normally this jump feature is only used when we want to upgrade the application code on products in the field, so it shouldn't normally be used.

I only mentioned it because I wasn't sure whether the CyU3PDeviceReset() API would need to be called from the boot code or the application code.

Are you suggesting that we jump to the boot code and try to upload a new application to RAM? If so, will this not also cause a re-enumeration?

Thanks


Hello,

Please see the below sequence.

- Data corruption is observed.

- A jump to the second stage boot-loader is performed.

(Ensure that the second stage boot-loader has got the option to load firmware into the RAM through USB with noReEnum parameter set to CyTrue)

- The main application firmware is loaded.

- No re-enumeration will be observed since the noReEnum is set to CyTrue.

- Perform vendor commands.

Kindly test the above sequence and let me know what you observe.
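One way a host-side test tool could keep track of this sequence is a small state machine. This is a sketch only: the state and event names are hypothetical, and the actual jump and firmware-load transfers are left to whatever host code issues them.

```c
typedef enum {
    ST_RUNNING,        /* application answering correctly */
    ST_CORRUPT,        /* bad reply seen, recovery not started */
    ST_IN_BOOTLOADER,  /* jumped to the second-stage bootloader */
    ST_FW_RELOADED,    /* application loaded into RAM again */
    ST_STILL_CORRUPT   /* recovery sequence did not help */
} recovery_state;

typedef enum {
    EV_BAD_REPLY,       /* a known-answer vendor command returned wrong data */
    EV_JUMPED_TO_BOOT,  /* jump-to-bootloader request completed */
    EV_FW_LOADED,       /* firmware download to RAM completed */
    EV_GOOD_REPLY       /* a known-answer vendor command returned correct data */
} recovery_event;

/* Advance the recovery sequence by one event. Events that do not apply
 * to the current state leave the state unchanged. */
static recovery_state recovery_next(recovery_state s, recovery_event e)
{
    switch (s) {
    case ST_RUNNING:
        return e == EV_BAD_REPLY ? ST_CORRUPT : ST_RUNNING;
    case ST_CORRUPT:
        return e == EV_JUMPED_TO_BOOT ? ST_IN_BOOTLOADER : ST_CORRUPT;
    case ST_IN_BOOTLOADER:
        return e == EV_FW_LOADED ? ST_FW_RELOADED : ST_IN_BOOTLOADER;
    case ST_FW_RELOADED:
        if (e == EV_GOOD_REPLY)
            return ST_RUNNING;
        if (e == EV_BAD_REPLY)
            return ST_STILL_CORRUPT;
        return ST_FW_RELOADED;
    default:
        return s;
    }
}
```

Ending in `ST_STILL_CORRUPT` means the sequence completed but the vendor command data was still wrong, which is worth reporting separately from a failure to jump or reload.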

Best regards,

Srinath S


During normal operation we store a merged second-stage bootloader and application image in the I2C EEPROM. On power-up the second-stage bootloader is executed, and it jumps to the application code.

After an ESD strike on the device, our software stops communicating with it because of data corruption.

Ideally we want a way to recover from the corruption at this point without the device having to re-enumerate.

To determine whether the corruption affects only data going from the FX3 to the PC or both directions, I've sent vendor commands that should return known data. It appears that only data coming from the FX3 is corrupt. Data sent to the FX3 seems to pass through correctly.

This is also when I tried jumping back to the bootloader code to check whether it is also corrupted, which it doesn't appear to be.

Our problem with the above sequence is that the main application code would have to live on the PC and be loaded whenever an ESD strike occurs.

What we are currently trying to add is a vendor command in the bootloader that jumps from the bootloader into the application code again, to see if this will resume communications. So far we have had no luck.

I will also try to write an application that can upload a version of the application code to the device. I can't use the Cypress Control Center, because it would require swapping drivers, which would cause the device to disconnect.


Hi,

We were able to perform the steps that you mentioned above.

We got the FX3 into the corrupt state, jumped to the second-stage bootloader and, using a custom application, uploaded the main application again. No re-enumeration was observed.

I modified our application code so that we know it has started, and performed a vendor command that gives access to the EEPROM where the FX3 code is stored. After writing a known string into the EEPROM and then reading it back, I have to perform multiple read accesses before I start to get the correct information out.

In a non-corrupt state I only need to perform one read access to get the required data. This data corruption seems to affect both the FPGA and the EEPROM, both of which are connected to the FX3 over I2C.

Are there any other options I could try?


Hello,

Please check if the I2C block was stopped before jumping to the second stage boot-loader. All the blocks that were running in the firmware need to be stopped before jumping to the second stage boot-loader.

Best regards,

Srinath S


Hi,

Below is our jump-to-bootloader code, which stops the I2C block.

case Jump_Boot: /* Request to switch control back to the boot firmware. */
    CyU3PUsbAckSetup ();
    CyU3PThreadSleep (10);
    AppStop (); // resets endpoints / destroys DMA channels
    CyU3PDebugDeInit ();
    CyU3PUartDeInit ();
    CyU3PI2cDeInit ();
    CyU3PUsbSetBooterSwitch (CyTrue);
    CyU3PUsbJumpBackToBooter (0x40078000);
    while (1)
        CyU3PThreadSleep (100);
    break;

This should be resetting the I2C block. Are there any other blocks I would have to reset?

Thanks


Hello,

- The CyU3PI2cDeInit() API is sufficient to stop the I2C block.

Can you please probe the I2C lines while the corrupt data is being read by the host? This would help us identify whether the corrupt data comes from the FX3 DMA buffers or from data corruption in the EEPROM.

Best regards,
Srinath S


We will need to check whether this is possible, because the product is enclosed. I'm not sure if we are able to open the casing while the power is still on.

I will have a look and get back to you.


We found a way of monitoring the I2C bus after corruption has occurred.

We loaded two messages into the EEPROM:

"Hello world first message" and "Second message in the eeprom"

After powering up the device and reading out the two strings, we get the correct strings back.

After getting the FX3 into the corrupt state and then asking for the same strings, we get:

(shown in hex) "0x55,0x33,0x56,0x43,0x03,0x80,0x01,0x08,0x00,0x00,0x01,0x08,0x01,0x00,0x00,0x00,0x0C,0x00,0x00,0x00,0x00"

and "o woHellfirstld ssagt me<null><null><null><null>"

The I2C capture shows the same data being read out of the EEPROM in both the corrupt and non-corrupt states, so there is no corruption on the EEPROM side.


Hello,

Please share the main application firmware and the second stage boot-loader firmware source code. If it is confidential, please share over the private message option.

Best regards,

Srinath S


A private message has been sent.

Thanks


Have you had any luck building my application code?

Some observations from our side:

During normal operation we have two endpoints in use: one bulk endpoint (data channel) and the control endpoint (communication channel).

If the device is idle (not sending data, but communications still transferring), we can strike the device consistently without corrupting the communications channel.

If we are transferring both data and communications and strike the device, after several strikes we see corruption on the communications channel (corruption is also seen on the data channel). If we slow down the transfer on the data channel, we don't see corruption.

In the corrupted state the device still seems to respond with the expected number of bytes; only the contents of the bytes are wrong.


Hi, we haven't heard anything from Cypress in a while on getting this ESD failure sorted.

We have still been trying to perform some tests ourselves. We seem to have two failure modes: either we get corrupt data out of the FX3 or we get no data out of the FX3.

After getting the device into a corrupted state and trying to open the software application, we can see that no data is being sent out of the FX3. Attached is the trace from our USB analyser when no data is sent out:

usb-trace_nack.PNG

Compared with when the FX3 returns data:

usb-trace_ack.PNG

The code behind this vendor command is below:

case 0xA1: // software is asking if microcode is Application or Bootloader
    Ep0_Buf[0] = 0x41; // Application will send "A", bootloader will send "B"
    apiRetStatus = CyU3PUsbSendEP0Data(1, Ep0_Buf);
    break;

I'm just trying to figure out why endpoint 0 would be NAKing the data.


Hi,

Sorry for the delay.

I have a question regarding your second post, where you mentioned that you were able to solve the problem using the CyU3PDeviceReset() API. Can you please explain the steps that were taken to achieve this?
Also, was the CyU3PDeviceReset() API called with CyTrue or CyFalse?

Can you please try the steps below and report the results:

1.) The second stage bootloader is loaded into RAM first, with the same descriptors as the application and with the noReEnum parameter set to CyTrue in the CyFx3BootUsbStart() API.

2.) Control jumps to the main application firmware, which starts executing. The noReEnum is set to CyTrue in the CyFx3BootUsbStart() API here too.

3.) When an ESD strike happens, control jumps to the second stage bootloader, and no re-enumeration should be observed.

4.) Control then jumps back to the application firmware, and you should still not observe any re-enumeration.

Please try the above steps and check whether you notice any different behaviour.

There should be no re-enumeration at any point in the above sequence.

Regards,
Yashwant


Hi, thanks for getting back.

We use CyU3PDeviceReset() with isWarmReset set to CyFalse to get the FX3 back into a usable state, so the image file has to be re-read from the I2C EEPROM.

The steps you mentioned above have already been tested. We didn't see any re-enumeration, but the data being returned was still corrupt: every other 4 bytes were in the wrong order.


Hi,

Can you please add another vendor command that calls the following APIs:

    CyU3PUsbFlushEp (0x00);
    CyU3PUsbResetEp (0x00);
    CyU3PUsbFlushEp (0x80);
    CyU3PUsbResetEp (0x80);

>>As soon as the ESD strike happens, call this vendor command over the control endpoint.

>>Then request data from the EEPROM using one of the other vendor commands that you have already added.

This vendor command flushes and resets the control endpoint and resets the channel.

Please use the above method and get back to us with the outcome.


Regards,
Yashwant


Hi,

Have you had a chance to try the method I described in my previous post?

If yes, can you please share the findings?

Regards,

Yashwant


I have rebuilt our firmware based on the GPIF to USB example in the Cypress SDK version 1.3.3.

We haven't seen corruption while testing this new firmware, even though the functionality should be the same as the old version.

We are now doing in-house ESD tests to see if this is suitable for our needs.

Thanks


Hi,

Is the new firmware suitable for your application?

How did the ESD tests go?


Can you please share the findings?

Regards,
Yashwant


We have seen varying results, but this seems to depend on the USB card we are using. With built-in USB 3.0 ports and other USB 3.0 cards, we have been able to get our device to recover and continue working.

The biggest change that we had to make to the GPIFtoUSB example code was to reset the FX3 when the CY_U3P_USB_EVENT_LNK_RECOVERY USB event is detected.

We have more USB cards to test, but no more changes to the FX3 code seem to be needed.
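For anyone wanting to try the same approach, here is a minimal sketch of one way to structure it. It is not our exact code. It assumes the reset is deferred from the USB event callback to the application thread through a small latch (a design choice of this sketch), and the names `recovery_latch`, `latch_on_usb_event` and `latch_take` are hypothetical. The SDK-specific parts are only described in comments so the logic can be compiled and tried off-target.

```c
#include <stdbool.h>

typedef struct {
    volatile bool reset_pending;
} recovery_latch;

/* Call from the USB event callback. On the FX3 the second argument
 * would be (evtype == CY_U3P_USB_EVENT_LNK_RECOVERY). */
static void latch_on_usb_event(recovery_latch *l, bool is_link_recovery)
{
    if (is_link_recovery)
        l->reset_pending = true;
}

/* Call from the application thread loop. Returns true once for any
 * number of link-recovery events seen since the previous call; the
 * caller then performs the reset (assumed here to be
 * CyU3PDeviceReset(), which is an assumption of this sketch). */
static bool latch_take(recovery_latch *l)
{
    if (!l->reset_pending)
        return false;
    l->reset_pending = false;
    return true;
}
```

Several link-recovery events arriving close together collapse into a single reset request, so the device is not reset repeatedly for one ESD strike.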
