Robot Learning

The generative AI revolution seen in tools such as ChatGPT and Midjourney is built on training massive neural networks on data from the internet. This formula does not translate easily to robotics, because the internet contains little data on robotic interaction. Robots need their own training data, which is usually collected slowly and tediously in laboratories. As a result, their abilities tend to be limited to specific tasks and environments.

The RT-X project, a collaboration of 34 robotics laboratories worldwide, seeks to change this. The initiative pools data, resources and code to make general-purpose robots a reality. The key lies in the concept of “cross-embodiment,” that is, training a single neural network to control various types of robots, regardless of their appearance or capabilities.

The RT-X dataset, the largest of its kind, contains nearly one million robotic trials with 22 types of robots performing a wide range of tasks, from object manipulation to assembly. Surprisingly, relatively simple machine learning models, similar to those used in ChatGPT, were able to control different robots using this dataset. The model is not told which robot it is controlling; it recognizes the robot type from its camera observations and sends the appropriate commands.
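
As a rough illustration of what pooling data across robots can look like, here is a minimal Python sketch. The `Step` schema, the function names and the toy data are assumptions made for this example and are not taken from the RT-X code; the actual dataset format and training pipeline are not described in this article.

```python
import random
from dataclasses import dataclass
from typing import Dict, List, Sequence


@dataclass
class Step:
    """One timestep in a shared, robot-agnostic format (hypothetical schema)."""
    image: bytes          # camera frame; the only cue to which robot this is
    instruction: str      # natural-language task description
    action: List[float]   # command vector in a common action space (assumed)


def pool_datasets(per_robot: Dict[str, List[Step]]) -> List[Step]:
    """Merge per-robot step lists into one pool, dropping the robot label."""
    pooled: List[Step] = []
    for steps in per_robot.values():
        pooled.extend(steps)
    return pooled


def sample_batch(pool: Sequence[Step], batch_size: int,
                 rng: random.Random) -> List[Step]:
    """Draw a training batch uniformly (with replacement) from the pool."""
    return [rng.choice(pool) for _ in range(batch_size)]


# Toy data standing in for two different robots.
per_robot = {
    "arm_a": [Step(b"frame-a", "pick up the apple", [0.1, 0.0, 0.0, 1.0])],
    "arm_b": [Step(b"frame-b", "open the drawer", [0.0, 0.2, 0.0, 0.0])],
}
pool = pool_datasets(per_robot)
batch = sample_batch(pool, batch_size=8, rng=random.Random(0))
```

Because the robot's name is dropped when the data is pooled, a model trained on such batches would have to infer the embodiment from the image alone, which is the behavior the article describes.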

In comparative tests, the RT-X model outperformed each laboratory’s specific control systems, demonstrating the effectiveness of multi-robot learning. The model takes advantage of the diversity of experiences of other robots to improve its robustness and adapt to different situations.

The RT-X project also explored combining robotic data with internet-scale text and image information. This allows robots to perform tasks that require semantic reasoning, such as understanding spatial relationships or following complex instructions. For example, a mobile robot was able to respond to commands such as “Move the apple between the can and the orange” or “Place an object on a piece of paper with the solution to ‘2+3’”, demonstrating a rudimentary capacity for reasoning and problem solving.

These initial results are promising and suggest that cross-embodiment robotics models could revolutionize the field. A single base model could serve as the foundation for a wide range of robotic tasks, adapting to new abilities through fine-tuning or even prompts, similar to how we interact with ChatGPT.

RT-X is just the beginning. As more laboratories join the collaboration, it is expected to advance the capabilities of robots with universal brains, benefiting from global data sharing and opening the door to a future where robots are truly general purpose.

News Safari
