Robot barista breaks new ground for AI machines

Forget the clumsy robots confined to factory floors. A new AI-powered robotic arm can prepare your morning coffee while seamlessly adapting to the mess of a kitchen – even if you accidentally bump the cup while it is pouring.

Researchers at the University of Edinburgh have developed a sophisticated robotic system that follows verbal commands, navigates unfamiliar environments, and performs complex tasks requiring a delicate touch and quick adaptation to unexpected changes.

The study, published Wednesday in Nature Machine Intelligence, demonstrates how combining advanced language processing with precise sensory feedback creates machines capable of operating in unpredictable environments – a challenge that has long stymied robotics engineers.

“We are glimpsing a future where robots with increasingly advanced intelligence become commonplace,” said Ruaridh Mon-Williams, principal investigator at the University of Edinburgh School of Informatics. “Human intelligence stems from the integration of reasoning, movement and perception, yet AI and robotics have often advanced separately.”

The Edinburgh team’s robot, called Ellmer (short for embodied-LLM-enabled robot), represents a shift in how machines are designed to understand and interact with the world. Unlike traditional robots that rely on pre-programmed responses, Ellmer pairs a large language model (LLM) similar to the one behind ChatGPT with sophisticated sensors that provide constant visual and tactile feedback.

This approach echoes a growing scientific consensus that human intelligence is fundamentally “embodied cognition” – that our thinking is inseparable from the interaction between our bodies and our environment.

“If Deep Blue – the first computer to beat a reigning world chess champion – was truly intelligent, shouldn’t it have been able to move its own pieces while playing?” the researchers ask in the paper, emphasizing the limitations of disembodied AI systems.

The seven-jointed robotic arm can respond to requests such as “I’m tired, and my friend is coming over soon. Can you make me a hot drink and decorate a plate with a random animal of your choice?” The system’s language model interprets the request, infers that coffee suits a tired person, and breaks the task down into manageable steps.
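The decomposition described above can be sketched in a few lines. This is an illustrative stand-in, not the Edinburgh team’s code: a real system would query an LLM for the plan, and the function and step names here are invented for the example.

```python
# Hypothetical sketch of LLM-style task decomposition. In ELLMER an LLM
# produces the plan; here a hard-coded mapping stands in for the model's
# output so the structure of the result is visible.

def plan_request(request: str) -> list[str]:
    """Break a natural-language request into ordered, executable steps."""
    text = request.lower()
    steps: list[str] = []
    if "tired" in text or "drink" in text:
        # The model infers that a tired person would like coffee.
        steps += ["locate_mug", "open_drawer", "scoop_coffee", "pour_hot_water"]
    if "animal" in text or "decorate" in text:
        steps += ["generate_animal_outline", "draw_on_plate"]
    return steps

plan = plan_request("I'm tired, can you make me a hot drink and decorate a plate?")
print(plan)
```

Each returned step name would then map to a low-level skill executed with sensory feedback.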

Beyond strict programming

Traditional robots perform well in controlled environments such as assembly lines, where every movement is predefined and obstacles stay put. But they often falter in dynamic environments like kitchens, where objects move and unexpected challenges arise.

Ellmer overcomes these limitations through constant sensory feedback. A force sensor on the robot’s “wrist” detects the pressure needed to open a drawer, pour water, or draw on a plate. Meanwhile, a depth camera provides visual information about the location and movement of objects.

This sensory information is fed back to the system in real time, allowing Ellmer to adjust its movements immediately – for example, correcting its pouring angle if someone moves the cup mid-pour.
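The feedback loop described above can be illustrated with a minimal proportional controller. This is a simplified sketch under assumed gains and timing, not Ellmer’s actual control code: each tick, the camera reports the cup’s position and the arm closes a fraction of the remaining gap.

```python
# Minimal closed-loop tracking sketch (hypothetical controller, assumed
# gain of 0.5 per tick; not the ELLMER implementation).

def track_cup(arm_x: float, cup_observations: list[float], gain: float = 0.5) -> float:
    """Proportionally steer the arm toward each observed cup position."""
    for cup_x in cup_observations:   # one depth-camera reading per tick
        error = cup_x - arm_x        # visual feedback: remaining offset
        arm_x += gain * error        # move a fraction of the error
    return arm_x

# The cup is bumped from position 10.0 to 14.0 mid-pour;
# the controller converges on the new position anyway.
final = track_cup(0.0, [10.0, 10.0, 14.0, 14.0, 14.0])
print(round(final, 2))
```

The same idea generalizes to the force channel: replace the camera reading with a wrist-sensor reading and the target position with a target contact force.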

“The integration of GPT-4 was found to equip the robot with the capabilities required for abstract reasoning,” the researchers noted in their study. “Our system is able to generate code and perform actions with force and vision feedback, effectively providing a form of intelligence for the robot.”

Cultural knowledge and artistic expression

Beyond practical tasks, Ellmer demonstrates creativity through a technique called retrieval-augmented generation (RAG). This allows it to access and apply contextually relevant examples from a knowledge base, much as humans draw on accumulated cultural knowledge.
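At its core, the retrieval step pairs a query with the most relevant stored example. The toy sketch below uses simple word overlap as the relevance score; the knowledge-base entries are invented for illustration, and a real RAG system would use embedding similarity over a much larger store.

```python
# Toy retrieval-augmented generation (RAG) sketch: score stored examples
# by word overlap with the query and return the best match. Entries are
# invented for illustration, not taken from ELLMER's knowledge base.

def retrieve(query: str, knowledge_base: list[str]) -> str:
    """Return the knowledge-base entry sharing the most words with the query."""
    q_words = set(query.lower().split())
    return max(knowledge_base,
               key=lambda doc: len(q_words & set(doc.lower().split())))

kb = [
    "to pour water tilt the kettle slowly and watch the scale reading",
    "to draw an animal outline keep pen pressure constant on the plate",
    "to open a drawer pull until the force sensor reads a sudden drop",
]
print(retrieve("draw a random animal on the plate", kb))
```

The retrieved example is then supplied to the language model as context, grounding its plan in demonstrated behavior rather than free invention.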

In one demonstration, when asked to decorate a plate with a “random animal”, the system used an image-generation model to create an animal outline, then drew that outline onto the plate with a pen, using force feedback to maintain consistent pressure.

The researchers evaluated alternative methods and found that RAG significantly increased the robot’s fidelity – its ability to perform tasks accurately without “hallucinating” inappropriate solutions.

Future applications and challenges

Ellmer successfully completed the coffee-making challenge, but the researchers acknowledge some limitations. The current system requires a reasonably tidy environment and sometimes struggles with visually complex scenes or heavily occluded objects.

The visual system correctly identified white coffee cups 100% of the time under ideal conditions, but accuracy dropped sharply to about 20% when a cup was 80–90% occluded by other objects.

At medium pouring speeds, the system achieved an accuracy of about 5.4 grams of error per 100 grams poured, but at higher speeds the error increased significantly, reaching roughly 20 grams at the maximum pouring rate.

Despite these challenges, the technology shows promising capabilities that could extend far beyond kitchen tasks.

“Ellmer’s potential extends to creating complex and artistic movements,” the researchers noted. “For example, models such as DALL-E allow trajectories to be derived from visual inputs and open up new avenues for robotic trajectory generation.”

As sensing technology improves and language models grow more capable, robots like Ellmer could soon assist in a variety of home and professional settings, potentially changing how humans and machines collaborate in unpredictable environments.

The research was supported by the Engineering and Physical Sciences Research Council (EPSRC) and led by Mon-Williams in partnership with global building materials company CEMEX, with doctoral students from the University of Edinburgh, the Massachusetts Institute of Technology and Princeton University.
