I had a very similar issue to your "500msec after start up delay" some time ago, and it turned out to be related to my using the WCO. Have a look at this forum discussion, as it might account for your 500msec delay as well.
So it's possible to reduce the time to 10ms. Have you tried to get an even faster startup?
I never looked into whether I could speed things up beyond the ~10msec timeframe I needed. From memory, I was seeing about a 5msec delay between the CPU powering up and my PWM signal becoming active, which was more than quick enough for what I wanted.
Someone who knows more about the internal workings of the chip's start up might be able to point to a way to make things a little quicker if 5msec isn't quick enough for you.