
Stability AI Releases TripoSR: An AI Model for Transforming Images into 3D Models

VIVE POST-WAVE Team • April 8, 2024

2-minute read

The evolution of generative AI applications is a feast for the eyes. From the hyper-realistic videos of OpenAI's Sora to the AI-generated game graphics of Google's recently introduced Genie, the field is now venturing into a domain closely tied to VR technology: transforming 2D images into 3D objects. The latest open-source AI model in this space, 'TripoSR,' is a collaborative effort between Stability AI and Tripo AI.

 

LRM: The 'Large Reconstruction Model' for 2D to 3D Conversion

Stability AI, the company behind the open-source text-to-image model Stable Diffusion, and Tripo AI, known for its 'text-to-3D model' generative AI tool, have joined forces. Their collaboration was inspired by a paper published by Adobe at the end of last year (November 2023), titled "LRM: Large Reconstruction Model for Single Image to 3D".

The paper introduces the Large Reconstruction Model (LRM), which is based on the Transformer architecture and can 'reconstruct' a neural radiance field (NeRF), in other words a lifelike 3D object, from a single flat image. Impressively, the paper claims that LRM completes the reconstruction of a 3D object in just 5 seconds.

By now, you can probably imagine the applications of LRM. Whether it's for entertainment and gaming in VR and virtual worlds, or professional fields such as industrial design and architecture, converting images to 3D without the need for manual modeling saves a significant amount of cost and time.

Is TripoSR Faster, Better, and More Affordable?

So, what's so special about TripoSR? Let's quickly summarize the key points.

According to Stability AI, TripoSR takes only about 0.5 seconds to generate a 3D model from an image on a single NVIDIA A100 GPU, roughly ten times faster than LRM's 5 seconds. The quality of the 3D models reconstructed by TripoSR is also said to be superior to LRM's. So it's faster and better, but there's more. TripoSR is also claimed to have cracked the design trilemma, the often impossible balance of speed, quality, and cost: Stability AI says that, unlike other large reconstruction models on the market, TripoSR can run on a low inference budget and doesn't necessarily require a GPU.

The results seem impressive, almost unbelievably so. As AI progresses at such a rapid pace, especially in the realm of 3D, I can't help but worry about how fans of 2D art might feel. Curious, I also tried out their TripoSR Demo page. Let's take a look at the results.

[Images: TripoSR demo results for four test inputs: anime character, Jensen Huang, anime character arms dealer, rubber duck]

I experimented with a few images, including anime characters, figurines, a photo of "AI arms dealer" Jensen Huang, and a rubber duck. The 3D models sometimes came out a bit flat and took multiple attempts to get "right". The facial features of both real people and figurines weren't very clear or distinct. The rubber duck probably had the best outcome in this batch of tests, although overexposed lighting could completely ruin the 3D object.

The TripoSR model code and weights are now available on GitHub and Hugging Face, so it is now up to developers and designers to experiment and see whether it can become even more accurate and useful.