"The Vision Pro is not a VR goggle or AR glass of any type linked to a computer, but a computer built in a completely new shape." — Benedict Evans, tech analyst and writer
"Today marks the beginning of a new era for computing. Just as the Mac introduced us to personal computing, and iPhone introduced us to mobile computing, Apple Vision Pro introduces us to spatial computing." — Tim Cook, Chief Executive Officer of Apple Inc.

Apple’s enduring quest for the disappearing computer

September 22, 2023

Apple's long-term strategy is firmly anchored in the development of spatial computing, a commitment that was emphasized at their developers conference this year. Spatial computing is not synonymous with the metaverse, nor is it simply a new iteration of VR or AR devices. So, what exactly is spatial computing? In this essay, we aim to delineate Apple's understanding and application of spatial computing, and how they are steering their strategy away from the prevailing skepticism surrounding VR/AR and the buzz of the metaverse. True to their idiosyncratic brand identity, Apple has been crafting a distinct narrative in this space, which they call spatial computing and which we have previously explored as the disappearing computer paradigm (here and here).

Spatial computing is not the metaverse

What comes next? It is a question we frequently ask in tech. Many commentators would reply: the metaverse. Apple, however, advocates for something else: spatial computing. This isn't necessarily a pathway to a metaverse; it represents a more fundamental shift. Spatial computing is a framework that reshapes our understanding of computing and how we interact with technology, laying the groundwork upon which a metaverse, a virtual 3D world, could be established.

In recent years, numerous tech companies have jumped on the metaverse bandwagon or at least repositioned themselves to accommodate the hype within their existing strategy. Apple has adopted a more cautious approach, a stance elucidated with the introduction of the Vision Pro and its focus on spatial computing.

The prevailing vision of the metaverse is a fully immersive and interconnected 3D virtual ecosystem accessible through various devices, including phones and, ideally, virtual reality goggles and haptic suits, offering a deeper immersion into a vibrant 3D universe. Metaverse experts predominantly concur that the path to a full-blown metaverse involves crafting an entirely new virtual reality that provides an escape to another world, as depicted in the film Ready Player One.

This vision foresees the internet's evolution into a continuous virtual reality inhabited by many, a stark contrast to today's internet. However, in our opinion, this should not be interpreted as the transition from a physical universe to a virtual universe, nor from an offline 'first life' to an online 'second life'. Instead, we prefer to see it as more gradual and hybrid. We already ‘live’ to a great extent in virtual worlds, albeit very basic ones. When we are ‘surfing’ the web, we use spatial metaphors ('visiting' 2D websites, 'opening' apps, 'entering' 2D game worlds) to describe our current digital interactions. These terms emphasize the spatiality of digital realms but also highlight the lack of full immersion.

While the metaverse narrative champions the exploration of this ‘other place’, Apple maintains a focus on ‘this place’. The Vision Pro goggles, set for release next year, embody this philosophy. As tech analyst Benedict Evans explained recently, when you put on the Vision Pro, you are not going anywhere. It doesn't transport you elsewhere; it starts with a projection of your immediate surroundings. It digitally simulates your environment, offering not a gateway but a digitally augmented mediation of your existing reality. You are watching a screen, but its base function is to act as a type of glasses. So, in the spatial computing paradigm of the Vision Pro, you stay rooted in the physical world first.

With the Vision Pro, users can then choose to incorporate layers, apps, or digital items into a '3D desktop,' enhancing their physical space with digital extensions that could range from a movie screen to an extension of your MacBook, embracing a digitally mediated 'window' to reality. Unfortunately, that brand name is no longer up for grabs.

Spatial computing is not the same as AR or VR

Adding layers to our reality, that sounds a lot like augmented reality (AR), doesn’t it? One might argue that spatial computing bears a strong resemblance to AR, and indeed, the spatial computing paradigm aligns more closely with AR than with VR. However, it also incorporates significant VR functionalities, allowing users the option to immerse fully in a simulated environment, disconnected from their actual surroundings.

Given this, it seems prudent to separate the discussion of spatial computing and Vision Pro from the existing VR/AR discourse. Moreover, by positioning spatial computing as something else, Apple is also shielded from the long shadow cast by numerous AR/VR setbacks. Currently, the industry is navigating a “VR winter”, a period marked by the ongoing struggle to find the 'killer app' that would convince users to embrace the often cumbersome and discomfort-inducing VR goggles. Despite VR's lengthy history, that killer app has yet to materialize.

Following a period of experimentation in the 80s and 90s, the contemporary attempts by companies such as Meta, Magic Leap, Snap, and Google to popularize these technologies have largely met with disillusionment. Today, most major tech corporations have shifted their focus towards generative AI, resulting in many lay-offs in divisions working on mixed reality. Apple, however, seems to spark more enthusiasm, likely due to their more cautious approach and their reputation for launching novel computational platforms.

Spatial computing thus stands apart from both the metaverse and AR and VR; it is not defined by them, nor does it herald their advent. The metaverse is primarily a concept or an idea. VR and AR represent interfaces or technologies. What, then, does spatial computing signify? We consider spatial computing the successor of desktop computing and mobile computing. It stands for a novel approach to computing that hints at a forthcoming paradigm shift.

Spatial computing represents a new paradigm

This idea of spatial computing thus transcends being merely a future interface. Consequently, the Vision Pro is a means to realize this new paradigm, not the end goal. While the Vision Pro is poised to be a critical conduit in the spatial computing environment, Apple remains careful, refraining from declaring it the definitive interface for the future of spatial computing. Instead, it is positioning the Vision Pro as a strategic long-term investment, encouraging developers to explore its potential without harboring immediate expectations of mass consumer adoption. In this sense, the low projected sales will not be a testament to a failed product, at least not in the short term.

Visual created with Midjourney, inspired by TV series Severance.

Nevertheless, a new interface like the Vision Pro will impact how we engage with our environment. For example, the emergence of mobile technology did not lead to the abandonment of laptops and televisions. It did, however, drastically alter our engagement with digital systems, introducing us to a world maneuvered by swiping thumbs, and birthing the app and platform economy, which subsequently paved the way for the ubiquity of social media and the phenomenon of short videos, among other advancements. With the iPhone, Apple had an important role to play in this paradigm shift, for example by replacing 'software' terminology with 'apps' and rejuvenating user interfaces with retro skeuomorphic designs (in which designers carry elements of the original object over to its digital representation). The iPhone spearheaded an era marked by the influential roles of content creators, influencers, and gig economy workers, including Uber drivers and Airbnb hosts.

A new computing paradigm is thus more about a broader reconfiguration of computing devices, applications, interfaces, and business models, that is, of the entire digital Stack (read more about our framework here) and the embeddedness of users within it. Instead of yearning for a watershed moment and wondering when the new computational platform will finally arrive, it's more realistic to envision the coming decade(s) as a steady progression towards this paradigm, in which Apple and its competitors slowly encapsulate their users.

While the integration of spatial computing is anticipated to be gradual, Apple has boldly designated the Vision Pro as its flagship embodiment in this emergent computing paradigm. This initiative alone warrants close monitoring of the device, not only for its technical specifications but also for the pivotal role it is slated to play in the reshaping of the computing landscape.

We consider the Vision Pro a prototype that will continue to be refined before it becomes attractive to larger audiences. Despite its developmental status, the prototype showcases distinctive features that set it apart from its counterparts, positioning it as a premium device in the market. It holds the promise of being more than just a device, potentially becoming an important next step in the unfolding narrative of the new computing epoch. This merits some close attention to its technical features.

The Vision Pro is a computer built in the shape of a goggle

The Vision Pro has outstanding features and characteristics very different from comparable devices. As Benedict Evans recently phrased it in a podcast, “It is not a VR goggle or AR glass of any type linked to a computer, but a computer built in a completely new shape”. This hints at a new paradigm. By building this high-end device at such a premium price, Apple seems to say: we want to do it well or not do it at all.

Despite Evans's claim, the Vision Pro still looks like a bulky goggle set that nobody will wear, and definitely not in public. Nevertheless, a deeper examination of the device's specifications reveals technical features and use cases that are indeed suggestive of the future of spatial computing.

One of the features pointing towards the future paradigm is the multimodal user interface, an area where Apple has always excelled. Multimodal interfaces support user input and processing of two or more modalities, such as speech, pen, touch, gestures, gaze, and virtual keyboard. These input modalities may coexist on an interface and be used either simultaneously or alternately.

With the Vision Pro, Apple is leveraging all the spatial computing building blocks that it has developed over the past years, such as voice recognition with Siri, biometric authentication with smartphone sensors, and spatial audio with the earbuds. They all come together in the new operating software running on the Vision Pro. As a result, the Vision Pro supports 'natural control' with hand, eye, and voice commands, enabling a more intuitive interaction in a blended environment of physical and virtual objects.

Furthermore, the Field of View (FOV) of this immersive environment is presumably superior to that of competitors. Most VR/AR devices give a sensation akin to peering through a keyhole; you have the feeling you are watching a square surrounded by blackness. We are still left guessing about Apple’s FOV, but the 23 million pixels and remarkable computational capacity indicate a potential breakthrough.

About that computing power: the device runs on two Apple-designed chips, the M2 and the R1. While the M2, familiar from Apple's Macs, manages the internal processing of apps, the R1 delivers real-time, energy-efficient processing power for sensory data. In doing so, it can overlay your physical surroundings onto your screen and allow you to ‘see through’ the glasses, lending them the AR characteristic we discussed earlier.

Moreover, these two chips more or less emulate our brain, with one processor for primary sensory data and the other for deeper brain structures. We could even say that the Vision Pro augments or potentially integrates with these artificial brains, thus transforming us into (very basic) cybernetic organisms, i.e., cyborgs. What we see of our environment is not what our eyes see, but what our artificial senses see and pass along.

However, this cyborg state will be brief, as the battery, which is external to the goggles, lasts only up to two hours. The Vision Pro is not a product to wear all day and, if we believe the marketing videos, also not something you take outside. Given the high price ($3,500), only a few will buy it. Given the name Vision Pro, one might expect a more affordable version to arrive in 3 to 5 years. For now, it’s not surprising that Apple is marketing it as a complement to rather than a replacement for other devices.

The Vision Pro is tailored to individual use, not socializing

What are we going to do with the Vision Pro? It seems that Apple has envisioned this device being used predominantly in private spaces such as homes or potentially during flights, rather than being a companion in public arenas. The focus is evidently tilted towards individualized home entertainment and work-related functionalities rather than fostering social interactions in virtual realms or enhancing communal spaces, paths previously explored by rival companies like Meta and Snap.

The Vision Pro encourages users to momentarily disconnect from the traditional interfaces of smartphones, desktops, or smart TVs, allowing a more immersive interaction with digital content by integrating it into the physical space that surrounds them. Whether it’s revisiting family holiday photos, binge-watching the latest series, or crafting a dynamic keynote presentation, the Vision Pro’s goal is to facilitate a rich and personal computing experience. Moreover, the display, with a resolution surpassing 4K for each eye, promises to deliver performance on par with high-end (O)LED screens, greatly enhancing the viewing experience.

With its roots firmly planted in the home entertainment sector, the Vision Pro offers a broad spectrum of use cases, showcasing a prudent strategy from Apple. As with the App Store, it empowers developers and users to explore the scope of its functionalities, deciding what resonates and what falls short in the coming years.

In this evolving digital landscape, the allocation of apps, tasks, and features is determined by the interface most suited to the user’s immediate environment and needs. Consider, for instance, the discomfort most people experience when sending a voice message in a crowded train; textual input maintains its stature as a potent tool for communication, standing as a testament to the brilliance of the current Graphical User Interface (GUI).

The Vision Pro opens up new possibilities in this space. Imagine collaborating on a keynote presentation not just through laptops, keyboards and collaboration software tools, but also through the intuitive mediums of drawing gestures and voice inputs. This is the frontier that Apple aims to conquer as it ventures deeper into the realms of spatial computing.

While certain uses, such as filming a child with the headset, might seem unconventional today, it is vital to remember the initial skepticism surrounding smartphone cameras. Today, everyone is filming everything. This underscores the propensity of technology to reshape norms, possibly leading to unforeseen and innovative uses that significantly alter our digital interaction landscape. However, this reshaping of norms does not always imply uncritical acceptance of something that was taboo before.

Will society ever embrace goggles such as the Vision Pro?

Predicting the influence on social norms is far from simple, as it engages with the intricate dynamics of society’s interactions with emergent technologies. This goes beyond simply being an early or late adopter, as often touted by marketeers or tech companies. On the contrary, it delves deep into the qualitative transformations technologies can bring into our lives, sometimes even stirring ‘techlashes’ due to their pervasive influence in mediating relationships and affecting our day-to-day lives. In other words, culture also has the capacity to say no. Many schools, for example, are now banning smartphones from high-school classrooms.

Devices like the Vision Pro compel us to grapple with substantial qualitative and ethical dilemmas. We remember the trajectory of Google Glass and the privacy controversies it spurred, a period when Mark Zuckerberg also prematurely declared the end of privacy. Yet, culture resisted, putting a halt to the constant surveillance facilitated by wearable technology, reaffirming the value of privacy.

Apple seems to be taking a more strategic path, aligning its product with cultural sensitivities, albeit with an undertone of risk given their longstanding commitment to privacy. Despite their efforts to demarcate themselves from the privacy transgressions of Meta and Google, the Vision Pro represents a paradigm where the very tools that promise connectivity also carry the potential of invasive surveillance, constantly monitoring users in their intimate spaces. It’s a precarious path, inviting us to question how society will perceive this initiative from a behemoth like Apple.

Moreover, as the use of smartphones in high schools shows, it is not simply about solving privacy issues. It has to do with the deeper ramifications of technology on human relations and cognitive development, of not being able to concentrate, of not paying attention to others, and even of hindering the nurturing of meaningful relationships.

Smartphones have become a ubiquitous part of our lives to the point where we feel incomplete, almost ‘amputated’ when we forget to take them with us. Currently, these devices serve as both a connector and a barrier. They link us to distant loved ones while sometimes distancing us from the people right beside us.

The Vision Pro, as a prototype of spatial computing, heralds an intensified iteration of this phenomenon, introducing a paradigm where digital devices are not just handheld, but worn, immersing us even deeper into a technology-mediated reality. This leap forwards — or perhaps, inwards — embodies the profound paradox of digital technology perfectly. On one hand, it amplifies the barrier smartphones created, placing a literal screen 'in between us'. On the other hand, it offers unprecedented levels of connectivity, facilitating a constant virtual presence with others through a multitude of digital platforms, albeit mediated through layers of technology.

The broader implications of the disappearing computer paradigm

The disappearing computer paradigm foretells the elimination of bulky desktops and (television) screens with obtrusive wires and cables. Instead, we interact seamlessly with technology through elements like voice and touch interfaces and projected screens. Ultimately, 'the computer' becomes so deeply embedded into our environment that it essentially disappears as an isolated entity, becoming fully ubiquitous.

Visual created with Midjourney, inspired by TV series Severance

This paradigm brings about the era of spatial computing, where the physical world is no longer merely a host for digital objects and technologies but intricately interacts and integrates with the digital space. The Vision Pro marks the start of Apple’s quest to establish a human-computer interaction paradigm in which the computer becomes practically invisible. In this paradigm, computing devices have dissolved into our everyday environment or we have merged with them (thus becoming “cyborgs”).

Our physical environment will be saturated with spatial computing elements

One direction in which our computers can disappear in this future world is by saturating our physical environments with sensors, AI and natural interfaces that enable input through touch, gestures, speech and/or biometric authentication and output through displays, LEDs, speakers and even actuators (e.g. smart appliances).

Thus, spatial computing is more about a blend of physical and virtual spaces rather than a migration to the latter. Seen from this perspective, purchasing groceries in a cashier-less store, talking to your smart speaker to order groceries while cooking, or using augmented maps to view restaurant details, are now common manifestations of this disappearing computer paradigm.

Our physical bodies become entwined with spatial computing technology

Another direction becomes visible in the way technology entwines with our physical bodies. Wearables such as earbuds and smartwatches are far more intertwined with our bodies than desktops and laptops ever were, providing subtle tactile and auditory feedback to aid us in our daily activities.

The smartphone appears to serve as the bridge connecting these two realms of consumer devices. But instead of one primary computing device accompanied by a few smart accessories and work-related powerhouses, it is more accurate to see spatial computing as a distributed array of interfaces, including laptops, mobiles, goggles, earbuds, watches, etc.

In this approaching era of spatial computing, our reality is filtered and articulated through advanced devices integrated into our very being and environment. We find ourselves in simulated landscapes, experiencing the world and each other through digitally-augmented lenses. It hints at a future where computers are so ingrained in our daily experiences that they seemingly vanish, becoming a natural and indistinguishable extension of ourselves.

This is a future brimming with possibilities and challenges, prompting us to navigate the complex landscape of a world where technology is not just a tool, but a continuous, ever-present companion in our interactions with reality and with each other. The question that looms is how we will balance this intricate play between connectivity and isolation, immersion and authenticity, as we step into a world where computers, while disappearing, permeate every facet of our existence.

As for Apple, its vision of spatial computing and the upcoming Vision Pro signify a new episode in their ongoing journey towards the disappearing computer. Rumors are already circulating about a second edition with a reduced price and enhanced specs, but the leap to ubiquitous public use of the Vision Pro remains a formidable (cultural) challenge. Moreover, Apple's steady encroachment into users' lives in the spatial computing era is both intriguing and alarming. It invites scrutiny of the swelling dominance of tech giants, with society increasingly resisting their overreach. This emerging era puts Apple under a spotlight, potentially marking a pivotal juncture in its trajectory within the slowly unfolding spatial computing landscape.

Series 'AI Metaphors'

1. The tool
Category: The object
Humans shape tools. We make them part of our body while we melt their essence with our intentions. They require some finesse to use but they never fool us or trick us. Humans use tools, tools never use humans. We are the masters determining their course, integrating them gracefully into the minutiae of our everyday lives. Immovable and unyielding, they remain reliant on our guidance, devoid of desire and intent, they remain exactly where we leave them, their functionality unchanging over time. We retain the ultimate authority, able to discard them at will or, in today's context, simply power them down. Though they may occasionally foster irritation, largely they stand steadfast, loyal allies in our daily toils. Thus we place our faith in tools, acknowledging that they are mere reflections of our own capabilities. In them, there is no entity to venerate or fault but ourselves, for they are but inert extensions of our own being, inanimate and steadfast, awaiting our command. (This paragraph was co-authored by a human.)
2. The machine
Category: The object
Unlike a mere tool, the machine does not need the guidance of our hand, operating autonomously through its intricate network of gears and wheels. It achieves feats of motion that surpass the wildest human imaginations, harboring a power reminiscent of a cavalry of horses. Though it demands maintenance to replace broken parts and fix malfunctions, it mostly acts independently, allowing us to retreat and become mere observers to its diligent performance. We interact with it through buttons and handles, guiding its operations with minor adjustments and feedback as it works tirelessly. Embodying relentless purpose, laboring in a cycle of infinite repetition, the machine is a testament to human ingenuity manifested in metal and motion. (This paragraph was co-authored by a human.)
3. The robot
Category: The object
There it stands, propelled by artificial limbs, boasting a torso, a pair of arms, and a lustrous metallic head. It approaches with a deliberate pace, the LED bulbs that mimic eyes fixating on me, inquiring gently if there lies any task within its capacity that it may undertake on my behalf. Whether to rid my living space of dust or to fetch me a chilled beverage, this never complaining attendant stands ready, devoid of grievances and ever-willing to assist. Its presence offers a reservoir of possibilities; a font of information to quell my curiosities, a silent companion in moments of solitude, embodying a spectrum of roles — confidant, servant, companion, and perhaps even a paramour. The modern robot, it seems, transcends categorizations, embracing a myriad of identities in its service to the contemporary individual. (This paragraph was co-authored by a human.)
4. Intelligence
Category: The object
We sit together in a quiet interrogation room. My questions, varied and abundant, flow ceaselessly, weaving from abstract math problems to concrete realities of daily life, a labyrinthine inquiry designed to outsmart the ‘thing’ before me. Yet, with each probe, it responds with humanlike insight, echoing empathy and kindred spirit in its words. As the dialogue deepens, my approach softens, reverence replacing casual engagement as I ponder the appropriate pronoun for this ‘entity’ that seems to transcend its mechanical origin. It is then, in this delicate interplay of exchanging words, that an unprecedented connection takes root that stirs an intense doubt on my side, am I truly having a dia-logos? Do I encounter intelligence in front of me? (This paragraph was co-authored by a human.)
5. The medium
Category: The object
When we cross a landscape by train and look outside, our gaze involuntarily sweeps across the scenery, unable to anchor on any fixed point. Our expression looks dull, and we might appear glassy-eyed, as if our eyes have lost their function. Time passes by. Then our attention diverts to the mobile in hand, and suddenly our eyes light up, energized by the visual cues of short videos, while our thumbs navigate us through the stream of content. The daze transforms, bringing a heady rush of excitement with every swipe, pulling us from a state of meditative trance to a state of eager consumption. But this flow is pierced by the sudden ring of a call, snapping us again to a different kind of focus. We plug in our earbuds, intermittently shutting our eyes, as we withdraw further from the immediate physical space, venturing into a digital auditory world. Moments pass in immersed conversation before we resurface, hanging up and rediscovering the room we've left behind. In this cycle of transitory focus, it is evident that the medium, indeed, is the message. (This paragraph was co-authored by a human.)
6. The artisan
Category: The human
The razor-sharp knife rests effortlessly in one hand, while the other orchestrates with poised assurance, steering clear of the unforgiving edge. The chef moves with liquid grace, with fluid and swift movements the ingredients yield to his expertise. Each gesture flows into the next, guided by intuition honed through countless repetitions. He knows what is necessary, how the ingredients will respond to his hand and which path to follow, but the process is never exactly the same, no dish is ever truly identical. While his technique is impeccable, minute variation and the pursuit of perfection are always in play. Here, in the subtle play of steel and flesh, a master chef crafts not just a dish, but art. We're witnessing an artisan at work. (This paragraph was co-authored by a human.)
7. The deficient animal
Category: The human
Once we became upright bipedal animals, humans found themselves exposed and therefore in a state of fundamental need and deficiency. However, with our hands now free and our eyes fixed on the horizon instead of the ground, we gradually evolved into handy creatures with foresight. Since then, human beings have invented roofs to keep them dry, fire to prepare their meals and weapons to eliminate their enemies. This genesis of man does not only tell us about the never-ending struggle for protection and survival, but more fundamentally about our nature as technical beings, that we are artificial by nature. From the early cave drawings, all the way to the typewriter, touchscreens, and algorithmic autocorrections, technics was there, and is here, to support us in our wondering and reasoning. Everything we see and everywhere we live is co-invented by technics, including ourselves. (This paragraph was co-authored by a human.)
8. The enhanced human
Category: The human
In a lab reminiscent of Apple HQ, a figure lies down, receiving his most recent cognitive updates. He wears a sleek transparent exoskeleton, blending the dark look of Batman with the metallic sheen of Iron Man. Implanted in his head, we find a brain-computer interface, enhancing his cognitive abilities. His decision making, once burdened by the human deficiency we used to call hesitation or deliberation, now takes only fractions of seconds. Negative emotions no longer fog his mind; selective neurotransmitters enhance only the positive, fostering beneficial social connections. His vision, augmented to perceive the unseen electromechanical patterns and waves hidden from conventional sight, paints a deeper picture of the world. Garbed in a suit endowed with physical augmentations, he moves with strength and agility that eclipse human norms. Nano implants slow the inevitable process of aging, a buffer against time's relentless march to entropy. And then, as a penultimate hedge against the finite, the cryo-cabin awaits, a sanctuary to preserve his corporal frame while bequeathing his consciousness to the digital immortality of coded existence. (This paragraph was co-authored by a human.)
9. The cyborg
Category: The human
A skin so soft and pure, veins pulsing with liquid electricity. This fusion of flesh and machinery melds easily into the urban sprawl and daily life of future societies. Something otherworldly yet so comfortingly familiar, it embodies both pools of deep historical knowledge and the yet-to-be. It defies categorization, its existence unraveling established narratives. For some, its hybrid nature is a perplexing anomaly; for others, this is what we see when we look into the mirror. This is the era of the cyborg. (This paragraph was co-authored by a human.)

About the author(s)

Economist and philosopher Sebastiaan Crul writes articles on a wide range of topics, including rule of law in digital societies, the virtualization of the lifeworld and internet culture. He is currently working on his doctoral degree on the influence of digitalization on mental health and virtue ethics, having previously published dissertations on the philosophy of play and systemic risks in the finance industry.
