Quick Answer

General Intuition vs Genie 3 vs Cosmos: Spatial AI 2026

General Intuition raised $320M at a $2.3B valuation on June 25, 2026, to build spatial AI frontier models trained primarily on gameplay clips. Spatial AI (large models that learn from video, gameplay, simulation, or 3D data rather than text alone) is one of the fastest-emerging AI categories in 2026. How does General Intuition compare to Google DeepMind Genie 3, NVIDIA Cosmos, Meta V-JEPA, World Labs, and other spatial AI bets? Short answer: each has a distinct data thesis and target market, and the category is large enough for multiple winners.

Last verified: June 26, 2026.

TL;DR

  • General Intuition: gameplay-trained spatial AI; $320M Series A at $2.3B; Bezos invested; Medal data moat
  • Google DeepMind Genie 3: foundation world model from internet video + simulation; playable 3D environment generation
  • NVIDIA Cosmos: platform foundation models for physical AI (robotics, autonomous vehicles); NVIDIA infrastructure
  • Meta V-JEPA: self-supervised learning from internet video; research-focused
  • World Labs (Fei-Fei Li): large world models with 3D scene focus; $230M seed at $1B (Sept 2024)
  • Decart: real-time generative video; consumer-facing apps
  • 1X World Model: robotics-grounded world model from teleoperation data
  • Common thesis: AI agents and robotics need spatial intuition that text-trained LLMs cannot provide

The category framing

Each player makes a different bet on what kind of data and architecture produces the best spatial intuition.

Approach | Data thesis | Target user
General Intuition | Curated gameplay clips have the best embodied-behavior signal | Robotics, embodied agents, game AI
Google Genie 3 | Internet video + simulation generates playable worlds | Researchers, developers, agent platforms
NVIDIA Cosmos | Synthetic + real video covers physical-AI use cases | Robotics, autonomous vehicles, industrial AI
Meta V-JEPA | Self-supervised learning from internet video | Open research, downstream applications
World Labs | 3D scenes and spatial reasoning are the key primitives | Spatial computing, AR/VR, design
Decart | Real-time generation enables consumer applications | Consumer AR/VR, gaming, content
1X World Model | Robotics teleoperation is the most relevant data | 1X humanoid robots specifically

Player-by-player

General Intuition

Dimension | Detail
Founded | 2024 (spun out of Medal)
Funding | $320M Series A at $2.3B (announced June 25, 2026); total ~$450M+
Lead founder | Pim de Witte (Medal CEO)
Notable investor | Jeff Bezos personally
Data moat | Medal: billions of hours of curated gameplay
Stage | Pre-product; frontier model development
Differentiator | Gameplay-specific training data with strong rights position

The gameplay-first bet is the most differentiated approach in the category. Medal’s billions of hours of human-curated clips, paired with clean licensing terms, form a data asset that no other player can easily replicate. The open empirical risk is transfer: does intuition learned from gameplay carry over to real-world tasks?

Google DeepMind Genie 3

Dimension | Detail
Released | Genie 3 in August 2025; Genie research line since 2024
Parent | Google DeepMind
Data | Internet video + simulation environments
Output capability | Generates playable, interactive 3D environments from text/image prompts
Stage | Research preview, broader access expanding
Differentiator | Best-known and most-cited spatial AI model; Google compute + research

Genie 3 is the most public spatial AI model. It can generate full 3D environments that users can interact with in real time: text prompt in, playable world out. Genie’s existence validates the category and sets the bar for what “spatial AI generation” looks like.

NVIDIA Cosmos

Dimension | Detail
Released | January 2025
Parent | NVIDIA
Approach | World Foundation Models (WFMs); platform for physical AI
Target market | Robotics, autonomous vehicles, industrial AI
Differentiator | Native NVIDIA GPU integration; existing robotics customer base; platform business model

Cosmos is the most commercially deployed entrant in the spatial AI category. NVIDIA has been positioning Cosmos as the foundation layer for physical AI developers, analogous to CUDA’s role in deep learning. The platform model (pre-trained models + GPUs + tooling) gives Cosmos commercial momentum even if individual model quality is contested.

Meta V-JEPA

Dimension | Detail
Released | V-JEPA (original) 2024; V-JEPA 2 in 2025
Parent | Meta FAIR (research org)
Approach | Self-supervised learning from internet video
Stage | Research releases, open-weight versions
Differentiator | Yann LeCun’s preferred architecture (JEPA); strong research direction

V-JEPA is the research-oriented entrant. Yann LeCun, Meta’s longtime Chief AI Scientist, has argued for years that JEPA-style architectures are the path to “real” AI, beyond next-token prediction. Meta releases V-JEPA models openly, which makes the architecture broadly accessible. The path to commercial deployment is less clear than for Cosmos or Genie.

World Labs

Dimension | Detail
Founded | 2024
Funding | $230M seed at $1B (September 2024)
Founders | Fei-Fei Li + Justin Johnson + Christoph Lassner + Ben Mildenhall
Approach | Large World Models with 3D scene focus
Stage | Pre-product or early-product
Differentiator | Founder credibility (Fei-Fei Li); 3D scene primitive focus

World Labs is the most academically credentialed entrant: Fei-Fei Li is one of the most cited AI researchers in the world. The 3D-scene focus differentiates it from video-trained alternatives (General Intuition, Genie, V-JEPA). World Labs’ target applications appear to be spatial computing, AR/VR, design, and architecture, more than robotics.

Decart

Dimension | Detail
Founded | 2023
Funding | Multiple rounds; growing fast
Approach | Real-time generative video
Target market | Consumer AR/VR, gaming, content creation
Differentiator | Speed and consumer-app focus

Decart is the most consumer-product-oriented entrant. Its real-time generation capability is differentiated; its target market is closer to consumer media than to robotics or research.

1X World Model

Dimension | Detail
Released | 2024-2025
Parent | 1X (humanoid robotics)
Approach | World model trained on robot teleoperation data
Target market | 1X’s own humanoid robots
Differentiator | Robotics-grounded training data

1X is vertically integrated: it builds the robots, collects the teleoperation data, trains the world model, and uses that model in production. The data is the most directly relevant to robotics, but the scope is narrow (1X’s humanoid robots specifically).

Which approach wins?

Probably multiple. Spatial AI is too broad to be a winner-take-all market. Realistic outcomes:

Robotics platform layer

NVIDIA Cosmos is positioned to win as the default platform for robotics developers who want pre-trained foundation models with tooling. NVIDIA’s GPU lock-in and existing customer base make it the default starting point.

Frontier research and generation

Google Genie 3 is best-positioned for the “generate a playable 3D world from a prompt” use case. Google’s compute and research depth make this hard to beat.

Vertical / data-moat winners

General Intuition could win specific high-leverage capabilities (complex planning, multi-agent reasoning, novel-strategy emergence) where gameplay data gives a real advantage over internet video or synthetic data. 1X wins for its own robots. World Labs wins in spatial computing and 3D scene applications.

Open research

Meta V-JEPA continues to advance the academic frontier. If JEPA-style architectures prove superior over time, V-JEPA insights flow into all the others.

Consumer applications

Decart wins where real-time generation matters more than absolute model quality.

How to think about this as a builder

If you build robotics or autonomous systems

  • Evaluate NVIDIA Cosmos as your default platform
  • Watch General Intuition for first model releases (likely 2027), which could be a differentiated alternative for specific capabilities
  • Track 1X World Model insights for humanoid-specific applications

If you build AR/VR or spatial computing

  • World Labs will be the most relevant option once its product ships
  • Decart for real-time generation use cases
  • Genie 3 for generation-from-prompt use cases

If you build game AI

  • General Intuition is the most aligned with your data and use cases, once its product is available
  • Until then, use existing LLM + game-engine integration

If you build chat or text AI applications

  • Spatial AI is not directly relevant in 2026
  • Capabilities will likely surface as multimodal upgrades to GPT, Claude, and Gemini in 2027-2028

Bottom line

Spatial AI is a real and growing category, and it is part of a broader funding wave: in the same week as General Intuition’s $320M round, Sail Research raised $80M for agent inference and Mirendil raised $200M for AI research tools. The investment cycle is unmistakable: the next leg of AI is moving beyond text-trained LLMs into models that understand space, time, and embodied action.

General Intuition’s gameplay-first bet makes it the most differentiated entrant in the category. Google Genie 3 is the most public frontier model. NVIDIA Cosmos is the most commercially deployed platform. World Labs has the most academically credentialed team. V-JEPA continues to push the research frontier. Decart owns the real-time consumer corner. 1X owns the humanoid robotics corner.

Expect 3-5 winners across the category by 2028, not a single dominant player. Build robotics on Cosmos today, watch General Intuition and Genie for the next leg, and assume the underlying capabilities will surface in your LLM stack as multimodal upgrades over the next 24 months.