3D reconstruction and generative models. I use neural and physical 3D representations to generate realistic 3D objects and scenes. My current focus is large-scale, dynamic, and interactable 3D scene generation. These generative models will be broadly useful for content creation, such as games and movies, and for training autonomous agents in virtual environments. In my research, I frequently use and adapt generative modeling techniques such as auto-decoders, GANs, and diffusion models.
In my project “DeepSDF,” I proposed a new representation for 3D generative models that made a breakthrough in the field. The question it answered is: what should a 3D model generate? Points, meshes, or voxels? In the DeepSDF paper, I argued that we should instead generate a “function” that takes a 3D coordinate as input and outputs the field value at that coordinate, where the function is represented as a neural network. This neural coordinate-based representation is memory-efficient, differentiable, and expressive, and it is at the core of the enormous progress our community has made in 3D generative modeling and reconstruction.
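The field-as-a-function idea above can be sketched in a few lines. The snippet below is a minimal illustration, not the DeepSDF implementation: it uses an analytic signed distance function of a sphere as a stand-in for the learned neural network, just to show what kind of function the network is trained to represent.

```python
# Sketch of the coordinate-based "field as a function" idea.
# A signed distance function (SDF) maps a 3D coordinate to the signed
# distance to the nearest surface: negative inside the shape, positive
# outside, and zero exactly on the surface. In DeepSDF, an analytic
# function like the sphere below is replaced by a learned MLP
# conditioned on a per-shape latent code.
import math

def sphere_sdf(x, y, z, radius=1.0):
    """Analytic SDF of a sphere at the origin (stand-in for the MLP)."""
    return math.sqrt(x * x + y * y + z * z) - radius

# The field can be queried at arbitrary continuous coordinates --
# no mesh connectivity or voxel grid is needed.
print(sphere_sdf(0.0, 0.0, 0.0))  # inside:  -1.0
print(sphere_sdf(1.0, 0.0, 0.0))  # surface:  0.0
print(sphere_sdf(2.0, 0.0, 0.0))  # outside:  1.0
```

Because the field is continuous and differentiable in the coordinates, a surface can be extracted at any resolution (e.g., with marching cubes at the zero level set), which is one reason this representation is so memory-efficient compared to dense voxels.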
3D faces with appearance and geometry generated by our AI model
There are two contributions I would like to make. First, I would like to enable AI generation of large-scale, dynamic, and interactable 3D worlds, which will benefit entertainment, autonomous agent training (robotics and self-driving), and other scientific fields such as 3D medical imaging. Second, I would like to devise a new, more efficient neural network architecture that better mimics the brain. Current AI models are highly inefficient in how they learn from data (they require huge numbers of labels), and they are difficult to train continuously or from verbal/visual instructions. I would like to develop new architectures and learning methods that address these limitations.