Loading market data...
ai

Meet Qwen-RobotSuite: Three Embodied AI Models for VLA Manipulation, Video World Modeling, and Navigation

MarkTechPost
Read Full Article at MarkTechPost
Share:PostShare
Ad Slot — In-Article (728x90)

We break down Qwen-RobotSuite, the Qwen team's three new embodied AI models. We cover RobotManip, a Vision-Language-Action model built on Qwen3. 5-4B for manipulation. We cover RobotWorld, a language-conditioned video world model with a 60-layer MMDiT.

We cover RobotNav, a navigation model built on Qwen3-VL across 2B, 4B, and 8B sizes. We walk through the architecture, data pipelines, and benchmark results for each.

This is a summary. For the full story, read the original article at MarkTechPost.

Original source: MarkTechPost

Ad Slot — Below Article (300x250)