What are AI 'world models,' and why do they matter?

World models, also known as world simulators, are being touted by some as the next big thing in AI.

AI pioneer Fei-Fei Li’s World Labs has raised $230 million to build “large world models,” and DeepMind has hired one of the creators of OpenAI’s video generator, Sora, to work on “world simulators.” (Sora was released on Monday.)

But what the heck are these things?

World models take inspiration from the mental models of the world that humans develop naturally. Our brains take the abstract representations from our senses and form them into a more concrete understanding of the world around us, producing what we called “models” long before AI adopted the phrase. The predictions our brains make based on these models influence how we perceive the world.

A paper by AI researchers David Ha and Jürgen Schmidhuber gives the example of a baseball batter. Batters have milliseconds to decide how to swing their bat, which is shorter than the time it takes for visual signals to reach the brain. The reason they are able to hit a 100-mile-per-hour fastball is that they can instinctively predict where the ball will go, Ha and Schmidhuber say.

“For professional players, this all happens subconsciously,” the research duo writes. “Their muscles reflexively swing the bat at the right time and location in line with their internal models’ predictions. They can quickly act on their predictions of the future without the need to consciously roll out possible future scenarios to form a plan.”

It is these subconscious reasoning aspects of world models that some believe are prerequisites for human-level intelligence.

Modeling the world

Although the concept has been around for decades, world models have gained popularity recently, in part because of their promising applications in the field of generative video.

Most, if not all, AI-generated videos veer into uncanny valley territory. Watch them long enough and something bizarre will happen, like limbs twisting and merging into each other.

Although a generative model trained on years of video might accurately predict that a basketball bounces, it has no actual idea why, just as language models do not really understand the concepts behind words and phrases. But a world model with even a basic grasp of why the basketball bounces the way it does will be better at showing it.

To enable this kind of insight, world models are trained on a range of data, including photos, audio, videos, and text, with the intent of creating internal representations of how the world works and the ability to reason about the consequences of actions.

Runway Gen-3: A sample from Gen-3, the video generation model of AI startup Runway. **Image Credit:** Runway

“A viewer expects that the world they’re watching behaves in a similar way to their reality,” said Alex Mashrabov, Snap’s former AI chief and the CEO of Higgsfield, which is building generative models for video. “If a feather drops with the weight of an anvil or a bowling ball shoots up hundreds of feet into the air, it’s jarring and takes the viewer out of the moment. With a strong world model, instead of a creator defining how each object is expected to move, the model will understand this.”

But better video generation is only the tip of the iceberg for world models. Researchers, including Meta chief AI scientist Yann LeCun, say the models could someday be used for sophisticated forecasting and planning in both the digital and physical realms.

In a talk earlier this year, LeCun described how a world model could help achieve a desired goal through reasoning. A model with a base representation of a “world” (such as a video of a dirty room), given an objective (a clean room), could come up with a sequence of actions to achieve that objective (deploy vacuums to sweep, clean the dishes, empty the trash), not because that is a pattern it has observed but because it knows at a deeper level how to go from dirty to clean.
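The goal-directed reasoning described above can be sketched in a few lines of code. This is only a toy illustration, not LeCun’s proposal or any real system: the room state, the action names, and the `predict` and `plan` functions are all hypothetical. The “world model” here is a hand-written function that predicts the next state for a given action, and the planner searches over those predictions until it reaches the goal.

```python
from collections import deque

# Hypothetical toy state: the parts of the room that are still dirty.
DIRTY_ROOM = frozenset({"floor", "dishes", "trash"})
CLEAN_ROOM = frozenset()

# Hypothetical actions and the part of the room each one cleans.
ACTIONS = {
    "vacuum_floor": "floor",
    "wash_dishes": "dishes",
    "empty_trash": "trash",
}

def predict(state, action):
    """Toy 'world model': predict the state that results from an action."""
    return state - {ACTIONS[action]}

def plan(start, goal):
    """Breadth-first search over predicted states.

    Returns a shortest list of actions leading from start to goal,
    or None if the goal cannot be reached.
    """
    queue = deque([(start, [])])
    seen = {start}
    while queue:
        state, actions = queue.popleft()
        if state == goal:
            return actions
        for action in ACTIONS:
            nxt = predict(state, action)
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, actions + [action]))
    return None

steps = plan(DIRTY_ROOM, CLEAN_ROOM)
print(steps)
```

The planner never sees an example of a cleaning routine. It finds the sequence only by asking the model what each action would do, which is the distinction the paragraph above draws between matching an observed pattern and reasoning about consequences. A real world model would learn `predict` from data rather than have it written by hand.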

“We need machines that understand the world; [machines] that can remember things, that have intuition, have common sense, things that can reason and plan to the same level as humans,” LeCun said. “Despite what you might have heard from some of the most enthusiastic people, current AI systems are not capable of any of this.”

Although LeCun estimates that we are at least a decade away from the world models he envisions, today’s world models are showing promise as elementary physics simulators.

OpenAI Sora Minecraft: Sora controlling a player in Minecraft and rendering the world. **Image Credit:** OpenAI

OpenAI notes in a blog post that Sora, which it considers to be a world model, can simulate actions like a painter leaving brush strokes on a canvas. Models like Sora, and Sora itself, can also effectively simulate video games. For example, Sora can render a Minecraft-like UI and game world.

Future world models may be able to generate 3D worlds on demand for gaming, virtual photography, and more, World Labs co-founder Justin Johnson said on an episode of the a16z podcast.

“We already have the ability to create virtual, interactive worlds, but it costs hundreds and hundreds of millions of dollars and a ton of development time,” Johnson said. “[World models] will let you not just get an image or a clip out, but a fully simulated, vibrant, and interactive 3D world.”

High hurdles

While the concept is enticing, many technical challenges stand in the way.

Training and running world models requires massive compute power, even compared to the amount currently used by generative models. While some of the latest language models can run on a modern smartphone, Sora (arguably an early world model) would require thousands of GPUs to train and run, especially if the use of such models becomes commonplace.

World models, like all AI models, also hallucinate and internalize biases in their training data. A world model trained largely on videos of sunny weather in European cities might struggle to comprehend or depict Korean cities in snowy conditions, for example, or simply do so incorrectly.

A general lack of training data threatens to exacerbate these issues, says Mashrabov.

“We have seen models being really limited with generations of people of a certain type or race,” he said. “Training data for a world model must be broad enough to cover a diverse set of scenarios, but also highly specific to where the AI can deeply understand the nuances of those scenarios.”

In a recent post, the CEO of AI startup Runway says that data and engineering problems prevent today’s models from accurately capturing the behavior of a world’s inhabitants (such as humans and animals). “Models will need to generate consistent maps of the environment, and the ability to navigate and interact in those environments,” he said.

OpenAI Sora: A video generated by Sora. **Image Credit:** OpenAI

If all the major hurdles are overcome, though, Mashrabov believes that world models could “more robustly” bridge AI with the real world.

They could also spawn more capable robots.

Robots today are limited in what they can do because they have no awareness of the world around them (or of their own bodies). World models could give them that awareness, Mashrabov said, at least to a point.

“With an advanced world model, an AI could develop a personal understanding of whatever scenario it’s placed in,” he said.

TechCrunch has an AI-focused newsletter! Sign up here to get it in your inbox every Wednesday.

This story was originally published on October 28, 2024, and was updated on December 14, 2024, with new details about Sora.
