Nvidia Optimizes Part-Level 3D Generation With Dual Volume Packing | Turtles AI

Nvidia Optimizes Part-Level 3D Generation With Dual Volume Packing
A new technique organizes parts into complementary volumes, improving quality, variety and control in creating 3D objects from single images
Isabella V | 14 June 2025


Nvidia researchers demonstrate part-level 3D object generation from a single image, leveraging a “dual volume packing” strategy that produces isolated, complete, and assemblable semantic parts with high quality, diversity, and generalization.

Key points:

  • End-to-end generation of part-level 3D meshes from single images
  • Organization of parts into two complementary volumes to avoid fusion
  • Support for an arbitrary number of parts with semantic modeling
  • Improved quality, variety, and robustness compared to previous methods

Nvidia recently presented “Efficient Part-level 3D Object Generation via Dual Volume Packing” (Tang et al., 2025), a novel approach that overcomes the limitations of unified, partition-free meshes, enabling accurate manipulation of the individual components of a 3D object. The system, trained on datasets such as Objaverse-XL and based on VAE and transformer models with latent diffusion, accepts a single RGB image (scaled to 518×518) as input and produces meshes in GLB format at resolutions up to 512³.
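The fixed 518×518 input resolution implies a simple preprocessing step before inference. A minimal sketch using Pillow; the function name and resampling filter are illustrative assumptions, not part of Nvidia's release, which may resize differently (e.g., with aspect-preserving padding):

```python
from PIL import Image

def prepare_input(image: Image.Image, size: int = 518) -> Image.Image:
    """Convert an arbitrary image to an RGB image at the model's
    expected input resolution (hypothetical helper)."""
    return image.convert("RGB").resize((size, size), Image.LANCZOS)
```

In practice the published demos handle input preparation; this only illustrates the stated input format.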

The core of the method is the “dual volume packing” strategy: by analyzing the connectivity between parts (the contact graph), the components are divided into two non-adjacent groups, avoiding collisions and fusions in the 3D volume. A heuristic edge-contraction algorithm transforms even non-bipartite contact graphs into bipartite ones, so that two fixed “volumes” always suffice, keeping generation parallelizable and efficient. This choice avoids the complexity of multi-level strategies: the results show efficient use of space, with no incomplete or merged parts.
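The bipartition step described above can be sketched in a few lines. This is a hedged reconstruction, not Nvidia's released code: the function name, the union-find bookkeeping, the BFS two-coloring, and the choice of which edge to contract on a conflict are all illustrative assumptions. The idea it demonstrates matches the paper's description: two-color the contact graph, and when an odd cycle makes that impossible, contract a conflicting edge (merging the two parts into one group) and retry, so two volumes always suffice.

```python
from collections import defaultdict

def pack_into_two_volumes(num_parts, contacts):
    """Assign each part to volume 0 or 1 so that contacting parts land in
    different volumes. If the contact graph is not bipartite, greedily
    contract a conflicting edge (the merged parts share a volume) and retry.
    Heuristic sketch; not the official implementation."""
    parent = list(range(num_parts))  # union-find over contracted groups

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    while True:
        # Build the contact graph over the current merged groups.
        adj = defaultdict(set)
        for u, v in contacts:
            ru, rv = find(u), find(v)
            if ru != rv:
                adj[ru].add(rv)
                adj[rv].add(ru)

        # Attempt a BFS two-coloring (volume 0 vs. volume 1).
        color, conflict = {}, None
        for start in map(find, range(num_parts)):
            if start in color:
                continue
            color[start] = 0
            stack = [start]
            while stack and conflict is None:
                node = stack.pop()
                for nb in adj[node]:
                    if nb not in color:
                        color[nb] = 1 - color[node]
                        stack.append(nb)
                    elif color[nb] == color[node]:
                        conflict = (node, nb)  # odd cycle detected
                        break
            if conflict:
                break

        if conflict is None:
            return [color[find(i)] for i in range(num_parts)]
        # Contract the conflicting edge: merge the two groups into one.
        a, b = conflict
        parent[find(a)] = find(b)
```

For a 4-cycle of contacting parts (bipartite) the parts alternate cleanly between the two volumes; for a triangle (non-bipartite) one edge is contracted, so exactly one contacting pair ends up sharing a volume, which is the cost the heuristic accepts.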

The entire process is end-to-end, with no need for preliminary 2D or 3D segmentation, and its runtime does not grow with the number of parts, unlike traditional methods. Qualitative and quantitative tests show higher part quality, greater morphological variety, and better generalization to unseen objects than previous solutions based on patches and sequential completion.

The official implementation is available under Nvidia’s non-commercial license on GitHub and Hugging Face (released June 11, 2025), with interactive demos via Gradio. The model is written in PyTorch, supports Ampere and Hopper GPUs, and includes tools to process raw GLB meshes and split their parts into the two separate volumes.

According to the paper, the framework generates complete part-level meshes in about 30 seconds per image, with consistent times regardless of the number of components. This efficiency addresses growing demand in 3D editing, animation, and robotics, facilitating modular and interoperable pipelines for complex digital applications.

Nvidia’s method introduces a robust and scalable solution for generating articulated 3D objects, ensuring semantic isolation of parts, efficient volume management, and high geometric fidelity.