
Text-Guided Real-World-to-3D Generative Models with Real-Time Rendering on Mobile Devices
  • Vu Truong (corresponding author)
  • Long Bao Le


Generative diffusion models have recently attracted enormous attention, with breakthroughs in text-to-image, text-guided image-to-image, and text-to-3D generation.
In this paper, we propose MobileGen3D, a bridge between text-driven real-world-to-3D generation and real-time on-device rendering. Given several real-world images of a person or object and a text prompt, MobileGen3D produces a 3D model of that content, customized according to the prompt, that can be rendered on mobile devices in real time. Our method requires no additional 3D training data. Because it is based on neural light fields (NeLF), MobileGen3D runs inference dramatically faster than 3D synthesis methods that rely on neural radiance fields (NeRF).
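To illustrate why a light-field formulation is cheaper at inference time, the following minimal sketch contrasts the two rendering styles. It is not the MobileGen3D implementation: the networks are tiny random MLPs standing in for trained models, the ray is encoded simply as origin plus direction (an assumption made here for brevity), and all function names are hypothetical. The point is only the query count: a NeRF-style renderer evaluates the network at many samples along each ray and alpha-composites them, whereas a NeLF-style renderer maps each ray to a color in a single forward pass.

```python
import numpy as np

rng = np.random.default_rng(0)

def make_mlp(in_dim, hidden, out_dim):
    """Random two-layer MLP weights; a stand-in for a trained network."""
    return (rng.normal(size=(in_dim, hidden)) * 0.1,
            rng.normal(size=(hidden, out_dim)) * 0.1)

def mlp(params, x):
    w1, w2 = params
    return np.maximum(x @ w1, 0.0) @ w2  # ReLU hidden layer

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def render_nerf(params, origins, dirs, n_samples=64, near=0.0, far=1.0):
    """NeRF-style: query (point, direction) at n_samples depths, then composite."""
    t = np.linspace(near, far, n_samples)                            # (S,)
    pts = origins[:, None, :] + t[None, :, None] * dirs[:, None, :]  # (R, S, 3)
    d = np.broadcast_to(dirs[:, None, :], pts.shape)                 # (R, S, 3)
    out = mlp(params, np.concatenate([pts, d], axis=-1))             # (R, S, 4)
    rgb = sigmoid(out[..., :3])
    sigma = np.maximum(out[..., 3], 0.0)                             # density >= 0
    delta = (far - near) / n_samples
    alpha = 1.0 - np.exp(-sigma * delta)                             # (R, S)
    # Transmittance: probability the ray reaches each sample unoccluded.
    ones = np.ones_like(alpha[:, :1])
    trans = np.cumprod(np.concatenate([ones, 1.0 - alpha[:, :-1]], axis=1), axis=1)
    color = ((alpha * trans)[..., None] * rgb).sum(axis=1)           # (R, 3)
    n_queries = pts.shape[0] * pts.shape[1]
    return color, n_queries

def render_nelf(params, origins, dirs):
    """NeLF-style: one network query per ray, mapping the ray directly to RGB."""
    out = mlp(params, np.concatenate([origins, dirs], axis=-1))      # (R, 3)
    return sigmoid(out), origins.shape[0]

# Usage: render the same 1,000 rays with both approaches.
n_rays = 1000
origins = np.zeros((n_rays, 3))
dirs = rng.normal(size=(n_rays, 3))
dirs /= np.linalg.norm(dirs, axis=1, keepdims=True)

nerf_color, nerf_queries = render_nerf(make_mlp(6, 32, 4), origins, dirs)
nelf_color, nelf_queries = render_nelf(make_mlp(6, 32, 3), origins, dirs)
print(nerf_queries, nelf_queries)
```

With 64 samples per ray, the NeRF-style path issues 64 times as many network queries as the NeLF-style path for the same image, which is the kind of gap that makes single-pass light-field rendering attractive on mobile hardware.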
As a result, we demonstrate that our method generates high-resolution 3D content with realistic edits and a disk storage requirement of just 6.48 MB. This content can be rendered directly on mobile devices and augmented/virtual reality devices, reaching a rendering speed of 61.2 FPS on the iPhone 14 used in our experiments.
Our implementation, with detailed guidelines, is available at https://github.com/tuanvu171/MobileGen3D