While the actual details depend on which external STM you plan to use, I doubt you will be able to fit the entire image into either SRAM or EEPROM before sending it over to the STM.
So you will have to send the image to the STM in portions, staged through SRAM buffers.
I would think that writing a custom app would be better than leveraging the OTA scheme as-is, since you probably want finer-grained control over what gets sent to the external MCU rather than just overwriting the entire Flash image. But if that is what you want to do, then using appropriate file offsets, as you have suggested, would work.
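To make the chunked transfer and file-offset idea concrete, here is a rough sketch only, not code from any SDK. Every name in it (stream_image, image_read_fn, chunk_send_fn, CHUNK_SIZE, mem_read, mem_send) is hypothetical, and the 256-byte buffer size is an assumption. In a real design the read callback would pull from wherever the downloaded file is staged, and the send callback would wrap whatever link (UART, SPI, I2C) and bootloader protocol your STM uses.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define CHUNK_SIZE 256u /* assumed size of the SRAM staging buffer */

/* Hypothetical callbacks; both return 0 on success, nonzero on failure.
 * read: copy len bytes starting at file offset `offset` into dst.
 * send: push len bytes to the external MCU; `offset` is relative to its image. */
typedef int (*image_read_fn)(void *ctx, uint32_t offset, uint8_t *dst, size_t len);
typedef int (*chunk_send_fn)(void *ctx, uint32_t offset, const uint8_t *src, size_t len);

/* Stream `length` bytes that begin at file offset `start`, one chunk at a time,
 * so the whole image never has to sit in SRAM at once. */
int stream_image(uint32_t start, uint32_t length,
                 image_read_fn read_cb, void *read_ctx,
                 chunk_send_fn send_cb, void *send_ctx)
{
    static uint8_t buf[CHUNK_SIZE];
    uint32_t done = 0;

    assert(read_cb != NULL && send_cb != NULL);

    while (done < length) {
        uint32_t remaining = length - done;
        size_t n = (remaining < CHUNK_SIZE) ? remaining : CHUNK_SIZE;

        if (read_cb(read_ctx, start + done, buf, n) != 0) {
            return -1; /* could not read the staged file */
        }
        if (send_cb(send_ctx, done, buf, n) != 0) {
            return -2; /* external MCU rejected or missed the chunk */
        }
        done += (uint32_t)n;
    }
    return 0;
}

/* In-memory stand-ins, only for trying the loop out on a PC. */
int mem_read(void *ctx, uint32_t offset, uint8_t *dst, size_t len)
{
    memcpy(dst, (const uint8_t *)ctx + offset, len);
    return 0;
}

int mem_send(void *ctx, uint32_t offset, const uint8_t *src, size_t len)
{
    memcpy((uint8_t *)ctx + offset, src, len);
    return 0;
}
```

The start argument is where the file offset comes in: if the STM image sits at some offset inside the downloaded file, you pass that offset and the STM image length, and only that region gets forwarded.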
BTW, we do have customers doing exactly what you are attempting to do, but they usually write a custom framework for this purpose that relies on the OTA scheme at the lower levels while still keeping the two images separate.