It depends on how deterministic your code is. If you can exercise every path and know that you have them all covered, you can fill a stack that is larger than you think you need with a known pattern, run the code, and then look at how much of the original pattern was never touched. That beats calculating by hand everything that might go on the stack.
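Here is a minimal sketch of that pattern technique, simulated on a plain array so it can run on a host machine. The names stack_paint, stack_unused, demo_used_bytes and STACK_FILL, and the 0xA5 fill value, are made up for illustration. On a real target you would paint the actual stack region, whose bounds typically come from your linker script or RTOS. I am also assuming a stack that grows downward, so the untouched bytes sit at the low end.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define STACK_FILL 0xA5u /* hypothetical fill pattern; any unlikely byte works */

/* Fill the whole region with the pattern before the code under test runs. */
static void stack_paint(uint8_t *base, size_t size)
{
    memset(base, STACK_FILL, size);
}

/* Count bytes still holding the pattern, scanning up from the lowest
 * address. Assumes a descending stack, so usage eats in from the top. */
static size_t stack_unused(const uint8_t *base, size_t size)
{
    size_t n = 0;
    while (n < size && base[n] == STACK_FILL) {
        n++;
    }
    return n;
}

/* Host-side demo: the array stands in for a real stack region. */
static size_t demo_used_bytes(void)
{
    static uint8_t region[1024];

    stack_paint(region, sizeof region);
    assert(stack_unused(region, sizeof region) == sizeof region);

    /* Simulate the code under test consuming the top 300 bytes. */
    memset(region + sizeof region - 300, 0, 300);

    return sizeof region - stack_unused(region, sizeof region);
}
```

In this demo, demo_used_bytes() reports 300 bytes used out of 1024. The figure is only a high-water mark for the paths you actually ran, so it is worth keeping some margin on top of it.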
If your code can take many different paths depending on its inputs and is not deterministic, it becomes much harder to put a limit on stack usage. Hope that you fit in the first category.
As for printf, it is a heavy user of memory. In embedded code, it is best to avoid routines of this nature when you can. I would guess that the 4K figure depends on the size of the output and the type of formatting that you are using. Since printf lives in a library, I would not take a static view of what it uses today and hope that this never changes; that risks overflowing your stack.