This startup is revolutionizing 3D content with Meta’s Segment Anything Model

Common Sense Machines is revolutionizing 3D content creation by leveraging Meta’s Segment Anything Model 2 (SAM 2) to transform 2D images into production-ready 3D assets. This breakthrough addresses a significant challenge in the generative AI landscape, where 3D asset creation has lagged behind 2D generation due to data limitations and multi-view rendering requirements. By drastically reducing production time and democratizing access to 3D modeling, CSM’s technology represents a crucial advancement for game developers, VR experiences, and visual effects industries.

The big picture: Common Sense Machines uses Meta’s open source Segment Anything Model 2 to translate 2D images and videos into fully functional 3D assets that are ready for production environments.

  • The Cambridge, Massachusetts-based company aims to dramatically accelerate 3D content creation—a process that traditionally takes hours or days for a single asset.
  • CSM’s technology makes professional 3D asset creation accessible even to users without specialized expertise in complex 3D modeling software.

Why this matters: 3D asset generation has faced unique challenges compared to 2D image generation, particularly with limited training data and the requirement to render objects convincingly from all angles.

  • By automating the conversion from 2D to 3D, CSM is significantly streamlining workflows for artists who would otherwise need multiple software tools like Blender, Maya, ZBrush, and Adobe Suite.
  • The technology enables generated 3D assets to be immediately useful for processes like rigging and animation rather than just being static models.

How it works: SAM 2’s advanced segmentation capabilities allow CSM to identify and separate individual elements within images, creating accurate component-based 3D models.

  • Released in July 2024, SAM 2 enables real-time, promptable object segmentation in both images and videos, building on the original SAM model from April 2023.
  • CSM’s software uses these segmentation capabilities alongside text prompts to generate 3D assets with properly defined component parts that are ready for production use.
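The core idea behind this pipeline is decomposing a scene into per-object masks. The toy sketch below is not Meta’s SAM 2 API or CSM’s software; it just illustrates, with a plain-Python connected-component pass over a binary grid, what "separating individual elements" into component masks means before any 3D reconstruction happens:

```python
from collections import deque

def component_masks(grid):
    """Label 4-connected foreground regions in a binary grid and return
    one boolean mask per region -- a toy stand-in for the per-object
    masks a promptable segmentation model like SAM 2 produces."""
    h, w = len(grid), len(grid[0])
    seen = [[False] * w for _ in range(h)]
    masks = []
    for y in range(h):
        for x in range(w):
            if grid[y][x] and not seen[y][x]:
                # Flood-fill one region, recording it as its own mask.
                mask = [[False] * w for _ in range(h)]
                queue = deque([(y, x)])
                seen[y][x] = True
                while queue:
                    cy, cx = queue.popleft()
                    mask[cy][cx] = True
                    for ny, nx in ((cy - 1, cx), (cy + 1, cx),
                                   (cy, cx - 1), (cy, cx + 1)):
                        if (0 <= ny < h and 0 <= nx < w
                                and grid[ny][nx] and not seen[ny][nx]):
                            seen[ny][nx] = True
                            queue.append((ny, nx))
                masks.append(mask)
    return masks

# Two separate foreground blobs -> two component masks.
grid = [
    [1, 1, 0, 0],
    [1, 0, 0, 1],
    [0, 0, 1, 1],
]
print(len(component_masks(grid)))  # 2
```

In a real pipeline each mask would correspond to a segmented object (a wheel, a limb, a prop) that can then be lifted into its own 3D component for rigging and animation.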

What they’re saying: CSM’s CEO Tejas Kulkarni credits Meta’s open source approach as critical to the company’s development and competitive ability.

  • “It’s almost like an extended massive research team that we can’t afford,” Kulkarni explains, emphasizing how essential open source access has been for his startup.
  • “Without that, it’s very hard to be competitive, and it’s hard to have the resources to focus on the peripheral things that we don’t have expertise on, like those image segmentation models and video segmentation models.”

Looking ahead: CSM is already working with game developers to accelerate their production workflows and plans to expand its technology to generate entire 3D worlds from 2D images.

  • The company will continue to rely on Meta’s Segment Anything technology as it develops these more ambitious generative 3D capabilities.
  • This collaboration demonstrates how open source AI models can empower smaller companies to innovate in specialized domains like 3D content creation.
