$M^2$-VLA: Boosting Vision-Language Models for Generalizable Manipulation via Layer Mixture and Meta-Skills
