Beyond Human Demonstrations: Diffusion-Based Reinforcement Learning to Generate Data for VLA Training