Perceptual Decoupling for Scalable Multi-modal Reasoning via Reward-Optimized Captioning