General Intuition vs Genie 3 vs Cosmos: Spatial AI 2026
General Intuition raised $320M at a $2.3B valuation on June 25, 2026 to build spatial AI frontier models trained primarily on gameplay clips. Spatial AI, meaning large models that learn from video, gameplay, simulation, or 3D data rather than text alone, is one of the fastest-emerging AI categories in 2026. How does General Intuition compare to Google DeepMind Genie 3, NVIDIA Cosmos, Meta V-JEPA, World Labs, and other spatial AI bets? Short answer: each has a distinct data thesis and target market, and the category is large enough for multiple winners.
Last verified: June 26, 2026.
TL;DR
- General Intuition: gameplay-trained spatial AI; $320M Series A at $2.3B; Bezos invested; Medal data moat
- Google DeepMind Genie 3: foundation world model from internet video + simulation; playable 3D environment generation
- NVIDIA Cosmos: platform foundation models for physical AI (robotics, autonomous vehicles); NVIDIA infrastructure
- Meta V-JEPA: self-supervised learning from internet video; research-focused
- World Labs (Fei-Fei Li): large world models with 3D scene focus; $230M seed at $1B (Sept 2024)
- Decart: real-time generative video; consumer-facing apps
- 1X World Model: robotics-grounded world model from teleoperation data
- Common thesis: AI agents and robotics need spatial intuition that text-trained LLMs cannot provide
The category framing
Each player makes a different bet on what kind of data and architecture produces the best spatial intuition.
| Approach | Data thesis | Target user |
|---|---|---|
| General Intuition | Curated gameplay clips have the best embodied-behavior signal | Robotics, embodied agents, game AI |
| Google Genie 3 | Internet video + simulation generates playable worlds | Researchers, developers, agent platforms |
| NVIDIA Cosmos | Synthetic + real video covers physical-AI use cases | Robotics, autonomous vehicles, industrial AI |
| Meta V-JEPA | Self-supervised learning from internet video | Open research, downstream applications |
| World Labs | 3D scenes and spatial reasoning are the key primitives | Spatial computing, AR/VR, design |
| Decart | Real-time generation enables consumer applications | Consumer AR/VR, gaming, content |
| 1X World Model | Robotics teleoperation is the most relevant data | 1X humanoid robots specifically |
Player-by-player
General Intuition
| Dimension | Detail |
|---|---|
| Founded | 2024 (spun out of Medal) |
| Funding | $320M Series A at $2.3B (announced June 25, 2026); total ~$450M+ |
| Lead founder | Pim de Witte (Medal CEO) |
| Notable investor | Jeff Bezos personally |
| Data moat | Medal: billions of hours of curated gameplay |
| Stage | Pre-product; frontier model development |
| Differentiator | Gameplay-specific training data with strong rights position |
The gameplay-first bet is the most differentiated approach in the category. Medal’s billions of hours of human-curated clips, paired with clean licensing terms, form a data asset no other player can replicate. The open empirical risk is transfer: whether intuition learned from gameplay carries over to real-world tasks.
Google DeepMind Genie 3
| Dimension | Detail |
|---|---|
| Released | Genie 3 in August 2025; Genie research line since 2024 |
| Parent | Google DeepMind |
| Data | Internet video + simulation environments |
| Output capability | Generates playable, interactive 3D environments from text/image prompts |
| Stage | Research preview, broader access expanding |
| Differentiator | Best-known and most-cited spatial AI model; Google compute + research |
Genie 3 is the most publicly visible spatial AI model. It generates full 3D environments that users can interact with in real time: text or image prompt in, playable world out. Its existence validates the category and sets the bar for what “spatial AI generation” looks like.
NVIDIA Cosmos
| Dimension | Detail |
|---|---|
| Released | January 2025 |
| Parent | NVIDIA |
| Approach | World Foundation Models (WFMs); platform for physical AI |
| Target market | Robotics, autonomous vehicles, industrial AI |
| Differentiator | Native NVIDIA GPU integration; existing robotics customer base; platform business model |
Cosmos is the most commercially deployed entrant in the spatial AI category. NVIDIA has been positioning it as the foundation layer for physical AI developers, analogous to CUDA’s role in deep learning. The platform model (pre-trained models + GPUs + tooling) gives Cosmos commercial momentum even if the quality of individual models is contested.
Meta V-JEPA
| Dimension | Detail |
|---|---|
| Released | V-JEPA (original) 2024; V-JEPA 2 in 2025 |
| Parent | Meta FAIR (research org) |
| Approach | Self-supervised learning from internet video |
| Stage | Research releases, open-weight versions |
| Differentiator | Yann LeCun’s preferred architecture (JEPA); strong research direction |
V-JEPA is the research-oriented entrant. Yann LeCun, who championed the architecture as Meta’s Chief AI Scientist, has argued for years that JEPA-style architectures are the path to “real” AI, beyond next-token prediction. Meta releases V-JEPA models openly, which makes the architecture broadly accessible. Its path to commercial deployment is less obvious than that of Cosmos or Genie.
World Labs
| Dimension | Detail |
|---|---|
| Founded | 2024 |
| Funding | $230M seed at $1B (September 2024) |
| Founders | Fei-Fei Li + Justin Johnson + Christoph Lassner + Ben Mildenhall |
| Approach | Large World Models with 3D scene focus |
| Stage | Pre-product or early-product |
| Differentiator | Founder credibility (Fei-Fei Li); 3D scene primitive focus |
World Labs is the most academically credentialed entrant: Fei-Fei Li is one of the most cited AI researchers in the world. The 3D-scene focus differentiates it from video-trained alternatives (General Intuition, Genie, V-JEPA). World Labs’ target applications appear to be spatial computing, AR/VR, design, and architecture, more than robotics.
Decart
| Dimension | Detail |
|---|---|
| Founded | 2023 |
| Funding | Multiple rounds; growing fast |
| Approach | Real-time generative video |
| Target market | Consumer AR/VR, gaming, content creation |
| Differentiator | Speed and consumer-app focus |
Decart is the most consumer-product-oriented entrant. Its real-time generation capability is differentiated; its target market is closer to consumer media than to robotics or research.
1X World Model
| Dimension | Detail |
|---|---|
| Released | 2024-2025 |
| Parent | 1X (humanoid robotics) |
| Approach | World model trained on robot teleoperation data |
| Target market | 1X’s own humanoid robots |
| Differentiator | Robotics-grounded training data |
1X is vertically integrated: it builds the robots, collects the teleoperation data, trains the world model, and uses it in production. That data is the most directly relevant to robotics, but the scope is narrow (1X’s humanoid robots specifically).
Which approach wins?
Probably multiple. Spatial AI is too broad to be a winner-take-all market. Realistic outcomes:
Robotics platform layer
NVIDIA Cosmos is positioned to win as the default platform for robotics developers who want pre-trained foundation models with tooling. NVIDIA’s GPU lock-in and existing customer base make it the path of least resistance.
Frontier research and generation
Google Genie 3 is best-positioned for the “generate a playable 3D world from a prompt” use case. Google’s compute and research depth make this hard to beat.
Vertical / data-moat winners
General Intuition could win specific high-leverage capabilities (complex planning, multi-agent reasoning, novel-strategy emergence) where gameplay data gives a real advantage over internet video or synthetic data. 1X wins for its own robots. World Labs wins in spatial computing and 3D scene applications.
Open research
Meta V-JEPA continues to advance the academic frontier. If JEPA-style architectures prove superior over time, V-JEPA insights flow into all the others.
Consumer applications
Decart wins where real-time generation matters more than absolute model quality.
How to think about this as a builder
If you build robotics or autonomous systems
- Evaluate NVIDIA Cosmos as your default platform
- Watch General Intuition for first model releases (likely 2027), which could be a differentiated alternative for specific capabilities
- Track 1X World Model insights for humanoid-specific applications
If you build AR/VR or spatial computing
- World Labs is the most relevant when their product ships
- Decart for real-time generation use cases
- Genie 3 for generation-from-prompt use cases
If you build game AI
- General Intuition is the most aligned with your data and use cases when their product is available
- Use existing LLM + game-engine integration today
If you build chat or text AI applications
- Spatial AI is not directly relevant in 2026
- Capabilities will likely surface as multimodal upgrades to GPT, Claude, Gemini in 2027-2028
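The guidance above amounts to a small lookup from what you build to which options to evaluate or track. A minimal sketch of that mapping follows. The domain keys (`robotics`, `ar_vr`, `game_ai`, `text_ai`) and the function name `options_to_track` are hypothetical labels chosen for illustration; they are not part of any vendor's API, and the ordering simply mirrors the bullet order in this article.

```python
# Hypothetical lookup table restating the builder guidance in this article.
# Keys and names are illustrative assumptions, not any vendor's API.
BUILDER_GUIDE = {
    "robotics": ["NVIDIA Cosmos", "General Intuition", "1X World Model"],
    "ar_vr": ["World Labs", "Decart", "Genie 3"],
    "game_ai": ["General Intuition"],
    "text_ai": [],  # the article treats spatial AI as not directly relevant in 2026
}


def options_to_track(domain):
    """Return the spatial AI options this article lists for a builder domain."""
    if domain not in BUILDER_GUIDE:
        raise KeyError(f"unknown domain: {domain!r}")
    # Return a copy so callers cannot mutate the shared table.
    return list(BUILDER_GUIDE[domain])


if __name__ == "__main__":
    for domain in BUILDER_GUIDE:
        print(domain, "->", options_to_track(domain) or "nothing to adopt yet")
```

The empty list for `text_ai` is deliberate: it encodes "wait for multimodal upgrades" rather than a product recommendation.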
Bottom line
Spatial AI is a real and growing category, and the wider funding climate supports it. In the same week as General Intuition’s $320M round, Sail Research raised $80M for agent inference and Mirendil raised $200M for AI research tools. The direction of the investment cycle is unmistakable: the next leg of AI is moving beyond text-trained LLMs into models that understand space, time, and embodied action.
General Intuition’s gameplay-first bet makes it the most differentiated entrant in the category. Google Genie 3 is the most public frontier model. NVIDIA Cosmos is the most commercially deployed platform. World Labs has the most academically credentialed team. V-JEPA continues to push the research frontier. Decart owns the real-time consumer corner. 1X owns the humanoid robotics corner.
Expect 3-5 winners across the category by 2028, not a single dominant player. Build robotics on Cosmos today, watch General Intuition and Genie for the next leg, and assume the underlying capabilities will surface in your LLM stack as multimodal upgrades over the next 24 months.