I do not see any P6 BGA design with 4 layers. The simplest one uses 6 layers (CY8CMOD-062-4343W). It's attached here.
4 layers sounds pretty tough for this package.
CY8CMOD-062-4343W Layout.brd.zip 936.0 K
I agree with Wang that four layers sounds like too few for a BGA. It all depends on the number of IOs you need to connect to the chip.
Check out the following two app notes on calculating the number of layers and design guidelines for PCBs using BGA packages:
As for not using vias in pads, this is entirely possible, but your PCB manufacturer needs to support small clearances.
It looks like we might be able to break out the signals on 2-3 layers, based on the CY8CMOD-062-4343W design and the guidance in the Xilinx application note. The outer row can break out easily, and the 2nd row of 40 pins can break out on the same layer, since there are 48 routing channels available. Then you only need to break out the 36 internal pins on 1 additional layer.
So while it might be neither optimal nor simple, it does look possible.
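For what it's worth, the ring counting above can be sanity-checked with a short script. This is only a sketch under stated assumptions: it assumes the two outer rings are fully populated on a 13 x 13 ball outline (my assumption, chosen because it reproduces the 48 and 40 figures quoted above) and that one trace fits in each channel between adjacent balls. The function names are made up for illustration, and the 36 inner pins figure is taken from the post as-is.

```python
def ring_ball_count(side: int) -> int:
    """Balls on the perimeter of a fully populated side x side square ring."""
    if side < 1:
        raise ValueError("side must be at least 1")
    return 1 if side == 1 else 4 * (side - 1)


def routing_channels(side: int, traces_per_channel: int = 1) -> int:
    """Escape capacity through the gaps between adjacent balls on a ring.

    Each edge of the ring has (side - 1) gaps, so the ring has 4 * (side - 1).
    """
    return 4 * (side - 1) * traces_per_channel


GRID_SIDE = 13     # assumption: 13 x 13 outline, not taken from a datasheet
INNER_PINS = 36    # figure quoted in the post

outer_row = ring_ball_count(GRID_SIDE)        # outermost ring
second_row = ring_ball_count(GRID_SIDE - 2)   # next ring inward
channels = routing_channels(GRID_SIDE)        # gaps in the outermost ring

# The 2nd row can share the outer row's layer only if every one of its
# pins gets its own channel between two outer-row balls.
second_row_fits = second_row <= channels

# One layer for the two outer rings, plus one more if any inner pins remain.
signal_layers = 1 + (1 if INNER_PINS > 0 else 0)

print(f"outer row: {outer_row} pins, 2nd row: {second_row} pins")
print(f"channels through outer row: {channels}, 2nd row fits: {second_row_fits}")
print(f"inner pins left: {INNER_PINS}, estimated signal layers: {signal_layers}")
```

This only counts pins against channels. It says nothing about whether your fab's trace/space rules actually allow a trace between two pads at this pitch, and it ignores power and ground pins, which normally drop straight to planes rather than route out.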
Yes, it might be possible, though the work does not look easy.