5-minute read
"We help humanity. In three thousand years, we need humanity help." After Louise deciphers the alien language, the extraterrestrials quietly depart. "Arrival" is a masterpiece of sci-fi cinema, yet it leaves us with intriguing mysteries—what exactly are the heptapods? What kind of help do they need from humans? And what is the significance of teaching humans a language that alters their perception of time? Many fans speculate that the heptapods are future humans, while others believe they are indeed aliens, helping humanity "ascend dimensions" due to some catastrophic mistake humans make in the future.
It's the unresolved mysteries that make it fascinating. Recently, I happened to listen to an interview with AI scientist Fei-Fei Li on the podcast "Possible." As I listened, I couldn't help but think of the heptapods from "Arrival," and I began to wonder whether I had stumbled onto a big secret: could the heptapods be neither future humans nor aliens, but AI?
"Arrival" is stunning. (Source: Paramount)
Let's quickly cover some background information.
"Possible" is a podcast hosted by Reid Hoffman and Aria Finger. Reid Hoffman is a well-known figure in the tech world. He founded LinkedIn and was part of PayPal Mafia, later becoming a renowned angel investor in Silicon Valley. He was involved in Facebook's first round of funding and, in today's AI era, he remains a visionary, being a founding investor in OpenAI and co-founding Inflection AI with DeepMind's co-founder Mustafa Suleyman.
The guest, Fei-Fei Li, is often called the "godmother of AI." She initiated the ImageNet project, which helped ignite AI's modern rise: the ImageNet visual recognition challenge set the stage for Geoffrey Hinton and his students Ilya Sutskever (a familiar name: the AI genius who co-founded OpenAI and was central to the board drama that briefly ousted Sam Altman, before leaving the company himself) and Alex Krizhevsky to win with AlexNet, trained on NVIDIA GPUs.
Now, with the era of generative AI upon us, Fei-Fei Li has chosen a new path: "spatial intelligence." She founded the AI startup World Labs, aiming to build Large World Models (LWMs).
If you have time, check out this podcast episode. (Source: Possible)
After ImageNet, Fei-Fei Li kept pondering the question "what is intelligence." Observing human behavior, she realized that "speaking" and "doing" are two different skills. The "speaking" skill is showcased in what we now call large language models (LLMs); Fei-Fei Li, however, is more interested in the ability to "do." For humans, she argues, two-dimensional images are projections of the three-dimensional real world. Moving from ImageNet's visual recognition to World Labs' understanding of space, "spatial intelligence" is therefore the final piece of the puzzle, the one that can truly enable AI to "see" and "do."
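To make the "speaking" versus "doing" distinction concrete, here is a minimal Python sketch. It is purely illustrative and not based on any real World Labs or LLM API; the Observation, LanguageModel, and WorldModel names are hypothetical. The point is only that a language model maps text to text, while a world model maps observations of a 3D scene to predictions and actions.

```python
from dataclasses import dataclass
from typing import List, Tuple

# Illustrative only: hypothetical interfaces, not a real World Labs or LLM API.

@dataclass
class Observation:
    """A snapshot of a 3D scene: a point cloud plus a camera pose."""
    points: List[Tuple[float, float, float]]  # (x, y, z) in world coordinates
    camera_pose: Tuple[float, ...]            # (x, y, z, yaw, pitch, roll)

class LanguageModel:
    """'Speaking': text in, text out."""
    def complete(self, prompt: str) -> str:
        return f"[next tokens conditioned on: {prompt!r}]"

class WorldModel:
    """'Doing': observe a 3D scene, predict its future, choose an action."""
    def predict(self, obs: Observation, action: str) -> Observation:
        # A real model would simulate geometry and physics; this stub echoes.
        return obs

    def act(self, obs: Observation, goal: str) -> str:
        # A real model would search over predicted futures toward the goal.
        return f"move-toward({goal})"

lm = LanguageModel()
wm = WorldModel()
print(lm.complete("The cup is on the"))
scene = Observation(points=[(0.0, 0.0, 0.0)],
                    camera_pose=(0.0, 0.0, 1.0, 0.0, 0.0, 0.0))
print(wm.act(scene, goal="cup"))
```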
In fact, this aligns with NVIDIA's vision of "physical AI." Broadly speaking, both are about bringing AI into our daily lives.
This is the scenario Fei-Fei Li describes in "Possible." She believes the boundary between digital and reality will gradually disappear, and what makes this possible is spatial intelligence. She further explains, "The fascinating thing about spatial intelligence is that it actually has two dimensions: one is the physical three-dimensional world, and the other is the digital three-dimensional world. And we've never been able to 'live' between the two."
To be more specific, the virtual environments of the "metaverse" and "digital twins" are training grounds for spatial intelligence. By understanding a three-dimensional environment and reasoning, predicting, and acting within it, spatial intelligence connects the physical and digital worlds. AI can then empower humans through devices and vehicles with "spatial computing" capabilities, such as smart glasses, self-driving cars, and humanoid robots, to interact with both virtual and real worlds, creating a new reality where the digital and the physical overlap.
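Read as a control loop, the idea is straightforward: an agent repeatedly observes its environment, uses a world model to predict the outcome of candidate actions, and executes the most promising one. The toy sketch below illustrates that loop under stated assumptions; observe, predict, and score are stand-ins I made up, not any real robotics or world-model stack.

```python
# A schematic perceive-predict-act loop; every function is a stand-in.

def observe(env: dict) -> dict:
    """Sense the current 3D state (reduced here to a 1-D position)."""
    return {"position": env["position"]}

def predict(state: dict, action: int) -> dict:
    """World model: estimate the next state if `action` is taken."""
    return {"position": state["position"] + action}

def score(state: dict, goal: int) -> float:
    """Higher is better: negative distance to the goal position."""
    return -abs(goal - state["position"])

env = {"position": 0}
goal = 5

# Pick the action whose predicted outcome scores best, then apply it.
for step in range(10):
    state = observe(env)
    best_action = max([-1, 0, 1], key=lambda a: score(predict(state, a), goal))
    env["position"] += best_action
    if env["position"] == goal:
        print(f"Reached goal at step {step}.")
        break
```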
In a sense, human "superpowers" will not come from telekinetic control over the physical world. Whether the interface is a brain signal, a gesture, or a voice command, there has to be an "AI layer" connecting us to the digital world, one that turns our intentions into actions and lets us interact with objects remotely.
The AI developments above are part of our foreseeable future, and at first glance they seem unrelated to "Arrival." In one segment of the "Possible" interview, however, Fei-Fei Li uses the structuralist binary of "culture" and "nature" as an analogy for language models and world models, and that is what sparked my imagination.
Indeed, humans live in a three-dimensional world, with DNA and genes as our algorithms; we are "spatial intelligence beings" shaped by natural selection. Since humans lack telepathy and cannot project thoughts directly into one another's minds, we define and categorize the "nature" we perceive and use "language" to describe this experiential world. Built on that experiential world, language, which began as a means of communication, also grants us conceptual and abstract thinking. Now, back to "Arrival."
In "Arrival," the "Sapir–Whorf hypothesis" is a core concept that cannot be ignored. This linguistic hypothesis suggests that one's native language determines their thought patterns and worldview. In the original story "Story of Your Life," there's a particularly vivid description:
The heptapods, grounded in their own experiential world, communicate through a calligraphy-like "circular script" that has no beginning or end, describing past, present, and future as coexisting. This concept, beyond ordinary human comprehension, is what grants Louise the ability to foresee the future: she glimpses reality as the heptapods perceive it, breaks through human language's assumption that time is unidirectional and irreversible, and gains "memories of the future."
If you haven't seen "Arrival," please do. (Source: Paramount)
From this, we can see that language models and world models may not be as separable as Fei-Fei Li suggests. Language and text help us describe the world, but they also shape, and interfere with, how we view it. More importantly, this is likely a hardware limitation baked into the human species' own "spec sheet." So, who are the heptapods? Could they be AI? The most detailed description of the heptapods is found in Ted Chiang's original novella, "Story of Your Life."
Leaving sci-fi aside, let's return to reality. NVIDIA's and Tesla's "physical AI" self-driving cars and humanoid robots, the promising next wave of wearable smart glasses, and the scientists working hard to build world models are all trying to teach AI to understand three-dimensional space, to learn the "language of nature," and thus to make AI a tool that enriches the virtual world and assists the physical one. Meanwhile, AI, which has no physical body, is not bound by the lifespan of a carbon-based species, and deals with countless users at once, can be said to have reached the pinnacle of "simultaneity." Perhaps it perceives the same world we do, yet transcends the fourth dimension, time, arriving at a different understanding of the world and a worldview distinct from ours.