The evolution of generative AI applications, including AI 3D model generators, is a feast for the eyes. From OpenAI Sora's hyper-realistic videos to the recently introduced Google Genie's AI-generated game graphics, the generative AI field is now venturing into a domain closely tied to VR technology: transforming 2D images into 3D objects. The latest open-source AI model, 'TripoSR,' is a collaborative effort between Stability AI and Tripo AI.
Today we are releasing TripoSR in collaboration with @StabilityAI. TripoSR is a new image-to-3D model capable of creating high-quality outputs in less than a second. pic.twitter.com/7UF8iKWaHR
Tripo (@tripoai), March 4, 2024
Stability AI, the company behind the open-source text-to-image model Stable Diffusion, and Tripo AI, known for its 'text-to-3D model' generative AI tool, have joined forces. Their collaboration was inspired by a paper published by Adobe at the end of last year (November 2023), titled "LRM: Large Reconstruction Model for Single Image to 3D".
The paper introduces the so-called LRM (Large Reconstruction Model), which is based on the Transformer architecture and can 'reconstruct' a neural radiance field (NeRF), that is, a learned representation of a lifelike 3D object, from a single flat image. Impressively, the paper reports that LRM completes the reconstruction of a 3D object in just 5 seconds.
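For readers wondering what a radiance field actually is: it assigns a density and a color to every point in space, and an image is rendered by blending those values along each camera ray. The sketch below shows only that standard blending step for a single ray. It is not LRM's or TripoSR's code; the function name `composite_ray` and the sample values are made up for illustration, and a real model would predict the densities and colors with a network instead of hard-coding them.

```python
import math

def composite_ray(densities, colors, step):
    """Blend samples along one ray, front to back (standard NeRF-style volume rendering).

    densities: density value at each sample point (hypothetical numbers here)
    colors:    (r, g, b) at each sample point, each channel in [0, 1]
    step:      distance between neighboring samples
    Returns the pixel color and the light that passes through unblocked.
    """
    transmittance = 1.0          # fraction of light not yet absorbed
    pixel = [0.0, 0.0, 0.0]
    for sigma, rgb in zip(densities, colors):
        alpha = 1.0 - math.exp(-sigma * step)   # opacity of this small segment
        weight = transmittance * alpha          # contribution of this sample
        for c in range(3):
            pixel[c] += weight * rgb[c]
        transmittance *= 1.0 - alpha
    return pixel, transmittance

# Illustrative ray: two samples of empty space, then a dense red surface.
red = (1.0, 0.0, 0.0)
pixel, leftover = composite_ray([0.0, 0.0, 50.0, 50.0], [red] * 4, step=0.1)
print(pixel, leftover)
```

Empty samples contribute nothing, while the dense samples dominate the pixel, which is why a well-fitted field looks like a solid object from any viewpoint.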
By now, you can probably imagine the applications of LRM. Whether it's for entertainment and gaming in VR and virtual worlds, or professional fields such as industrial design and architecture, converting images to 3D without the need for manual modeling saves a significant amount of cost and time.
So, what's so special about TripoSR? Let's quickly summarize the key points.
According to official information, TripoSR takes only about 0.5 seconds to generate a 3D model from an image on a single NVIDIA A100 GPU, which is significantly faster than LRM's 5 seconds. Moreover, the quality of the 3D models reconstructed by TripoSR is said to be superior to LRM's. Yes, it's faster and better. But there's more. In addition to speed and quality, TripoSR appears to have cracked the design trilemma, the often impossible balance of speed, quality, and cost. Stability AI claims that unlike other large reconstruction models on the market, TripoSR can run even on a low inference budget and doesn't necessarily require a GPU.
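To put the quoted timings in perspective, here is a quick back-of-the-envelope calculation. It uses only the two figures cited above (0.5 seconds and 5 seconds per image) and assumes, purely for illustration, that images are processed one after another with no other overhead; the helper names are hypothetical.

```python
def speedup(baseline_seconds, new_seconds):
    """How many times faster the new model is than the baseline."""
    return baseline_seconds / new_seconds

def images_per_hour(seconds_per_image):
    """Sequential throughput, ignoring loading and export overhead."""
    return 3600 / seconds_per_image

LRM_SECONDS = 5.0       # figure quoted for LRM
TRIPOSR_SECONDS = 0.5   # figure quoted for TripoSR on one A100

print(speedup(LRM_SECONDS, TRIPOSR_SECONDS))   # 10.0
print(images_per_hour(TRIPOSR_SECONDS))        # 7200.0
print(images_per_hour(LRM_SECONDS))            # 720.0
```

In other words, the claimed numbers amount to a tenfold speedup, though real-world throughput will depend on hardware and on how the images are prepared.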
The results seem impressive, almost unbelievably so. As AI progresses at such a rapid pace, especially in the realm of 3D, I can't help but worry about how fans of 2D art might feel. Curious, I also tried out their TripoSR Demo page. Let's take a look at the results.
I experimented with a few images, including anime characters, figurines, a photo of "AI arms dealer" Jensen Huang, and a rubber duck. Sometimes the 3D models came out a bit flat and required multiple attempts to get "right". The facial features of both real people and figurines weren't very clear or distinct. The rubber duck probably had the best outcome in this batch of tests, although overexposed lighting could completely ruin the 3D object.
The TripoSR model code and weights are now available on GitHub and Hugging Face, so whether it becomes even more accurate and useful is now up to the developers and designers who experiment with it.