OCTO-MODELS.GITHUB.IO
We introduce Octo, our ongoing effort to build open-source, widely applicable generalist policies for robotic manipulation. The Octo model is a transformer-based diffusion policy, pretrained on 800k robot episodes from the Open X-Embodiment dataset. It supports flexible task and observation definitions and can be quickly finetuned to new observation and action spaces. We are introducing two initial versions of Octo: Octo-Small (27M parameters) and Octo-Base (93M parameters)... The design of the Octo model emphasizes flexibility and scale: the model is designed to support a variety of commonly used robots, sensor configurations, and actions, while providing a generic and scalable recipe that can be trained on large amounts of data. Octo supports both natural language instructions and goal images as task specifications, as well as observation histories and multi-modal action distributions via diffusion decoding. Furthermore, we designed Octo specifically to support efficient finetuning to new robot setups, ...
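To make "multi-modal action distributions via diffusion decoding" concrete, here is a minimal sketch of DDPM-style reverse sampling for a one-dimensional action. It is not Octo's implementation. The learned, transformer-conditioned denoiser is replaced by a closed-form noise predictor for a hypothetical two-mode action distribution, and the names `MODES`, `MODE_STD`, `NUM_STEPS`, `make_schedule`, `predict_noise`, and `sample_action`, as well as the noise schedule, are all assumptions made for illustration.

```python
import math
import random

# Hypothetical setup: two equally likely "good" actions (e.g. pass an
# obstacle on the left or on the right). These values are illustrative only.
MODES = (-1.0, 1.0)
MODE_STD = 0.05
NUM_STEPS = 50


def make_schedule(num_steps, beta_start=1e-3, beta_end=0.2):
    """Linear beta schedule plus cumulative products of alpha (assumed values)."""
    betas = [beta_start + (beta_end - beta_start) * i / (num_steps - 1)
             for i in range(num_steps)]
    alphas = [1.0 - b for b in betas]
    alpha_bars, prod = [], 1.0
    for a in alphas:
        prod *= a
        alpha_bars.append(prod)
    return betas, alphas, alpha_bars


def predict_noise(x, alpha_bar):
    """Exact noise prediction for the toy Gaussian mixture.

    Stands in for a learned denoising network: for a known mixture, the
    noised marginal is again a mixture, so its score is available in closed
    form, and E[noise | x] = -sqrt(1 - alpha_bar) * score.
    """
    scale = math.sqrt(alpha_bar)
    var = alpha_bar * MODE_STD ** 2 + (1.0 - alpha_bar)
    logits = [-(x - scale * m) ** 2 / (2.0 * var) for m in MODES]
    top = max(logits)
    weights = [math.exp(l - top) for l in logits]
    total = sum(weights)
    score = sum(w / total * (-(x - scale * m) / var)
                for w, m in zip(weights, MODES))
    return -math.sqrt(1.0 - alpha_bar) * score


def sample_action(rng, num_steps=NUM_STEPS):
    """Start from Gaussian noise and denoise step by step (ancestral sampling)."""
    betas, alphas, alpha_bars = make_schedule(num_steps)
    x = rng.gauss(0.0, 1.0)
    for t in reversed(range(num_steps)):
        eps = predict_noise(x, alpha_bars[t])
        mean = (x - betas[t] / math.sqrt(1.0 - alpha_bars[t]) * eps) / math.sqrt(alphas[t])
        noise = rng.gauss(0.0, 1.0) if t > 0 else 0.0  # no noise on the last step
        x = mean + math.sqrt(betas[t]) * noise
    return x


if __name__ == "__main__":
    rng = random.Random(0)
    samples = [sample_action(rng) for _ in range(200)]
    left = sum(1 for s in samples if s < 0)
    print(f"samples near {MODES[0]}: {left}, near {MODES[1]}: {len(samples) - left}")
```

Each sample commits to one of the two modes rather than landing on their average, which is the property that a single-Gaussian or mean-squared-error action head would lack. In a policy such as the one described above, the noise predictor would presumably be a trained network conditioned on the transformer's output rather than a closed-form expression.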