DreamFusion
Recent breakthroughs in text-to-image synthesis have been driven by diffusion models trained on billions of image-text pairs. Adapting this approach to 3D synthesis would require large-scale datasets of labeled 3D assets and efficient architectures for denoising 3D data, neither of which currently exist. In this work, we circumvent these limitations by using a pre-trained 2D text-to-image diffusion model to perform text-to-3D synthesis. We introduce a loss based on probability density distillation that enables the use of a 2D diffusion model as a prior for optimization of a parametric image generator. Using this loss in a DeepDream-like procedure, we optimize a randomly-initialized 3D model (a Neural Radiance Field, or NeRF) via gradient descent such that its 2D renderings from random angles achieve a low loss. The resulting 3D model of the given text can be viewed from any angle, relit by arbitrary illumination, or composited into any 3D environment.
Learn more
GET3D
We generate a 3D SDF and a texture field via two latent codes. We utilize DMTet to extract a 3D surface mesh from the SDF and query the texture field at surface points to get colors. We train with adversarial losses defined on 2D images. In particular, we use a rasterization-based differentiable renderer to obtain RGB images and silhouettes. We utilize two 2D discriminators, each on RGB image, and silhouette, respectively, to classify whether the inputs are real or fake. The whole model is end-to-end trainable. As several industries are moving towards modeling massive 3D virtual worlds, the need for content creation tools that can scale in terms of the quantity, quality, and diversity of 3D content is becoming evident. In our work, we aim to train performant 3D generative models that synthesize textured meshes which can be directly consumed by 3D rendering engines, thus immediately usable in downstream applications.
Learn more
Text2Mesh
Text2Mesh produces color and geometric details over a variety of source meshes, driven by a target text prompt. Our stylization results coherently blend unique and ostensibly unrelated combinations of text, capturing both global semantics and part-aware attributes. Our framework, Text2Mesh, stylizes a 3D mesh by predicting color and local geometric details which conform to a target text prompt. We consider a disentangled representation of a 3D object using a fixed mesh input (content) coupled with a learned neural network, which we term neural style field network. In order to modify style, we obtain a similarity score between a text prompt (describing style) and a stylized mesh by harnessing the representational power of CLIP. Text2Mesh requires neither a pre-trained generative model nor a specialized 3D mesh dataset. It can handle low-quality meshes (non-manifold, boundaries, etc.) with arbitrary genus, and does not require UV parameterization.
Learn more
Fast3D
Fast3D is a lightning‑fast AI‑powered 3D model generator that transforms text prompts or single/multi‑view images into professional‑grade mesh assets with customizable texture synthesis, mesh density, and style presets, all in under ten seconds without any modeling experience. It combines high‑fidelity PBR material generation with seamless tiling and intelligent style transfer, delivers precise geometric accuracy for realistic structures, and supports both text‑to‑3D and image‑to‑3D workflows. Outputs are compatible with any pipeline, offering export in GLB/GLTF, FBX, OBJ/MTL, and STL formats, while its intuitive web interface requires no login or setup. Whether for gaming, 3D printing, AR/VR, metaverse content, product design, or rapid prototyping, Fast3D’s AI core enables creators to explore diverse ideas through batch uploads, random inspiration galleries, and adjustable quality tiers, bringing concepts to 3D reality in seconds rather than days.
Learn more