Carnegie Mellon University and Meta researchers have announced RoboAgent, an AI agent that leverages passive observations and active learning to enable a robot to acquire manipulation abilities on par with a toddler. (Credit: Carnegie Mellon University)

PITTSBURGH — Babies and toddlers learn by exploring their surroundings, and now robots can too. In a collaboration between Carnegie Mellon University and Meta, scientists have drawn on the way infants learn to create a new approach to teaching robots. The result is RoboAgent, an artificial intelligence agent designed to emulate a toddler’s learning process and acquire manipulation skills equivalent to those of a three-year-old child.

“We aimed to create a single AI agent capable of a wide range of skills in novel situations, similar to how human babies learn,” explains Vikash Kumar of the Robotics Institute in Carnegie Mellon’s School of Computer Science. “RoboAgent leverages passive observations and limited active play, just like infants who keenly watch, imitate, and replay to learn.”

RoboAgent showcases proficiency in 12 manipulation skills across various scenarios, demonstrating a dynamic learning platform adaptable to changing environments. Unlike prior research conducted in simulations, this project successfully operated in real-world environments using notably less data.

“RoboAgents exhibit a greater complexity of skills than previous attempts,” states Abhinav Gupta, an associate professor at the Robotics Institute, in a university release. “Our agent demonstrates a diverse skill set that surpasses any real-world robotic agent’s achievements. It combines efficiency, scalability, and adaptability to unseen situations.”

RoboAgent’s effectiveness and efficiency stem from its learning architecture. Rather than choosing one action per time step, as is traditional, its policy makes decisions over temporal chunks of movement. This policy structure lets the agent reason from limited experience and act according to specified goals.
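To make the idea of deciding in temporal chunks concrete, here is a toy Python sketch. It is not RoboAgent's actual architecture or code: the "robot" is assumed to be a point on a line, an action is a displacement, and every name (`per_step_policy`, `chunked_policy`, `rollout`) is hypothetical. It only illustrates the difference between querying a policy at every time step and querying it once per chunk of actions.

```python
from typing import Callable, List, Tuple

# A policy maps (state, goal) to a list of actions.
# A per-step policy returns one action; a chunked policy returns several.
Policy = Callable[[float, float], List[float]]


def per_step_policy(state: float, goal: float) -> List[float]:
    """Hypothetical per-step policy: move halfway to the goal, one action at a time."""
    return [(goal - state) * 0.5]


def chunked_policy(state: float, goal: float, chunk_size: int = 4) -> List[float]:
    """Hypothetical chunked policy: plan chunk_size equal moves that together reach the goal."""
    step = (goal - state) / chunk_size
    return [step] * chunk_size


def rollout(policy: Policy, state: float, goal: float, horizon: int) -> Tuple[float, int]:
    """Run a policy for `horizon` time steps and count how often it was queried."""
    queries = 0
    t = 0
    while t < horizon:
        chunk = policy(state, goal)  # one decision, possibly covering many steps
        queries += 1
        for action in chunk:
            if t >= horizon:
                break
            state += action
            t += 1
    return state, queries


if __name__ == "__main__":
    for name, policy in [("per-step", per_step_policy), ("chunked", chunked_policy)]:
        final_state, queries = rollout(policy, state=0.0, goal=1.0, horizon=8)
        print(f"{name}: final state {final_state}, policy queries {queries}")
```

In this toy setting the chunked policy is consulted once per four time steps instead of once per step, so each decision commits to a short stretch of movement rather than a single action.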

RoboAgent’s learning process draws inspiration from the way children accumulate knowledge. Just as parents guide their offspring, researchers teleoperated the robot to provide valuable self-experiences. However, RoboAgent’s learning scope goes beyond its immediate environment.

“To overcome limitations, RoboAgent learns from internet videos, similar to how babies acquire behaviors by observing their surroundings,” says Mohit Sharma, a Ph.D. student in robotics. “These videos help RoboAgent learn how humans interact with objects and utilize skills to complete tasks. It extracts valuable lessons from different scenarios and applies them to new challenges.”

The team’s ambitious project aims to enhance robots’ adaptability in diverse settings.

“RoboAgent’s learning could lead us closer to a universal robot capable of a range of tasks in various environments,” states Shubham Tulsiani, an assistant professor from the Robotics Institute. “This platform could make robots more useful in unstructured spaces such as homes, hospitals, and public areas.”

The project’s impact is further amplified by its open-source approach. The team is sharing its trained models, codebase, hardware drivers, and an extensive dataset, RoboSet, which is the largest publicly accessible robotics dataset on standard hardware. The goal is to foster collaboration and development within the robotics community, paving the way for a versatile and foundational general robotic agent in the future.

About StudyFinds Staff

StudyFinds sets out to find new research that speaks to mass audiences — without all the scientific jargon. The stories we publish are digestible, summarized versions of research that are intended to inform the reader as well as stir civil, educated debate. StudyFinds Staff articles are AI assisted, but always thoroughly reviewed and edited by a Study Finds staff member. Read our AI Policy for more information.


1 Comment

  1. PJ London says:

    Wouldn’t it be nice if people spoke English instead of the pseudo-science-babble-speak that they employ.
    They do not sound ‘learned’, they do not sound smart, they sound stupid.
    I think it was Einstein who said “If you can’t explain it to a six-year-old, then you don’t understand it.”