3DTopia-XL: Scaling High-quality 3D Asset Generation via Primitive Diffusion
By Naomi Wilson
Posted on: September 25, 2024
The increasing demand for high-quality 3D assets across various industries necessitates efficient and automated 3D content creation. Despite recent advancements in 3D generative models, existing methods still face challenges with optimization speed, geometric fidelity, and the lack of assets for phy...
Colorful Diffuse Intrinsic Image Decomposition in the Wild
By Naomi Wilson
Posted on: September 25, 2024
Intrinsic image decomposition aims to separate the surface reflectance and the effects from the illumination given a single photograph. Due to the complexity of the problem, most prior works assume a single-color illumination and a Lambertian world, which limits their use in illumination-aware image...
CVT-Occ: Cost Volume Temporal Fusion for 3D Occupancy Prediction
By Naomi Wilson
Posted on: September 25, 2024
Vision-based 3D occupancy prediction is significantly challenged by the inherent limitations of monocular vision in depth estimation. This paper introduces CVT-Occ, a novel approach that leverages temporal fusion through the geometric correspondence of voxels over time to improve the accuracy of 3D ...
Oryx MLLM: On-Demand Spatial-Temporal Understanding at Arbitrary Resolution
By Naomi Wilson
Posted on: September 22, 2024
Visual data comes in various forms, ranging from small icons of just a few pixels to long videos spanning hours. Existing multi-modal LLMs usually standardize these diverse visual inputs to a fixed resolution for visual encoders and yield similar numbers of tokens for LLMs. This approach is non-opti...