Welcome to the forum!
When you compile a "normal" BLE example project, you will see that the component consuming most of the flash and SRAM is the BLE component itself. Flash usage of 80% is quite common once the BLE stack is included. This is why the PSoC4 BLE chips have considerably more flash and SRAM than the originally introduced CY8C42xx devices. When you compare what is actually left unused (20% of 128kB flash is roughly 25kB) with what you get in the "normal" PSoC4 (4kB), you are still well off.
I am getting far worse figures than this.
Compiling the BLE shared memory bootloader example project with no changes gives:
Flash used: 120800 of 131072 bytes (92.2 %). Bootloader: 99456 bytes. Application: 21344 bytes
SRAM used: 14008 of 16384 bytes (85.5 %). Stack: 2048 bytes. Heap: 1024 bytes.
The 92% includes the application, but the actual user code in it is tiny, just one page.
I managed to get it down to 84% by manually setting the bootloadable's code base address, using the highest address in the bootloader MAP file as a guide. I think the default settings leave a hole in the memory map; I am not sure how the default is assigned. I have not tested that it still works, but it should.
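For anyone wanting to repeat this, here is a minimal sketch of the arithmetic involved, not the tool's own method. It assumes 128-byte flash rows (check the datasheet for your part) and that the bootloadable has to start on a row boundary. The function names are my own and are not part of any Cypress API.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: 128-byte flash rows and 128 KB of flash.
   Check the datasheet for your device. */
#define FLASH_ROW_SIZE 128u
#define FLASH_SIZE     131072u

/* Round addr up to the next multiple of row_size.
   row_size must be a non-zero power of two. */
uint32_t align_up(uint32_t addr, uint32_t row_size)
{
    assert(row_size != 0u && (row_size & (row_size - 1u)) == 0u);
    return (addr + row_size - 1u) & ~(row_size - 1u);
}

/* First row-aligned address after the last byte used by the bootloader.
   bootloader_last_addr is the highest address read from the MAP file. */
uint32_t bootloadable_base(uint32_t bootloader_last_addr)
{
    return align_up(bootloader_last_addr + 1u, FLASH_ROW_SIZE);
}

/* Flash usage in tenths of a percent, rounded to nearest. */
uint32_t flash_usage_tenths_percent(uint32_t used_bytes)
{
    return (used_bytes * 1000u + FLASH_SIZE / 2u) / FLASH_SIZE;
}
```

With the figures from the build report above, flash_usage_tenths_percent(120800) gives 922, i.e. 92.2%. A made-up highest bootloader address of 0x1234 would give a bootloadable base of 0x1280.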
As far as RAM goes, the bootloader appears to allocate 300 bytes for communication buffers and these are assigned at least twice, when only 144 are needed.
But am I correct in saying that RAM usage in the bootloader is not an issue, since that RAM is available to the application when not in bootloader mode?
I would suggest a different approach for OTA bootloading to reduce flash usage.
The PSoC4 BLE Pioneer Kit has an FRAM memory chip on board which can be accessed via I2C.
A simple bootloader that retrieves the bootloadable from the FRAM does not consume many resources.
A more complex bootloader can be programmed to first save the current bootloadable into another area of the FRAM before overwriting it.
The OTA transfer can then be performed using the bootloadable's own BLE stack, with the received data stored in the FRAM.
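To make the idea concrete, here is a rough sketch of the copy step such a bootloader would perform. Everything in it is an assumption for illustration: the FRAM and flash are simulated with plain arrays so the logic can be tried on a PC, the 128-byte row size and the FRAM layout are made up, and the helper names are hypothetical rather than Cypress APIs. On real hardware fram_read/fram_write would wrap the I2C transfers and flash_write_row would wrap the flash row write call.

```c
#include <stdint.h>
#include <string.h>

#define ROW_SIZE   128u                  /* assumed flash row size */
#define APP_ROWS   4u                    /* tiny image, illustration only */
#define IMAGE_SIZE (ROW_SIZE * APP_ROWS)

/* Hypothetical FRAM layout: new image, 16-bit checksum, backup area. */
#define FRAM_NEW_IMAGE 0u
#define FRAM_CHECKSUM  (FRAM_NEW_IMAGE + IMAGE_SIZE)
#define FRAM_BACKUP    (FRAM_CHECKSUM + 2u)
#define FRAM_SIZE      (FRAM_BACKUP + IMAGE_SIZE)

/* Stand-ins for the real hardware. */
uint8_t sim_fram[FRAM_SIZE];
uint8_t sim_flash[IMAGE_SIZE];

void fram_read(uint32_t addr, uint8_t *buf, uint32_t len)
{
    memcpy(buf, &sim_fram[addr], len);
}

void fram_write(uint32_t addr, const uint8_t *buf, uint32_t len)
{
    memcpy(&sim_fram[addr], buf, len);
}

void flash_write_row(uint32_t row, const uint8_t *data)
{
    memcpy(&sim_flash[row * ROW_SIZE], data, ROW_SIZE);
}

/* Simple additive checksum over the new image held in FRAM. */
uint16_t fram_image_checksum(void)
{
    uint8_t row[ROW_SIZE];
    uint16_t sum = 0u;
    for (uint32_t r = 0u; r < APP_ROWS; r++) {
        fram_read(FRAM_NEW_IMAGE + r * ROW_SIZE, row, ROW_SIZE);
        for (uint32_t i = 0u; i < ROW_SIZE; i++) {
            sum = (uint16_t)(sum + row[i]);
        }
    }
    return sum;
}

/* Returns 0 on success, -1 if the stored image fails its checksum
   (in which case flash and the backup area are left untouched). */
int bootload_from_fram(void)
{
    uint8_t row[ROW_SIZE];
    uint8_t stored[2];

    fram_read(FRAM_CHECKSUM, stored, 2u);
    uint16_t expected = (uint16_t)(stored[0] | ((uint16_t)stored[1] << 8));
    if (fram_image_checksum() != expected) {
        return -1;
    }

    /* Save the current application before overwriting it. */
    fram_write(FRAM_BACKUP, sim_flash, IMAGE_SIZE);

    /* Copy the new image into flash one row at a time. */
    for (uint32_t r = 0u; r < APP_ROWS; r++) {
        fram_read(FRAM_NEW_IMAGE + r * ROW_SIZE, row, ROW_SIZE);
        flash_write_row(r, row);
    }
    return 0;
}
```

The bootloader only needs an I2C driver and a row buffer for this, which is why it stays small. A real design would probably want a CRC rather than an additive checksum, and a way to restore the backup if the new image fails to start.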