Reconstructing Part-level 3D Models from A Single Image

Dingfeng Shi, Yifan Zhao, Jia Li*

State Key Laboratory of Virtual Reality Technology and Systems, Beihang University

Understanding an image through 3D representations has become an increasingly attractive topic in computer vision. State-of-the-art 3D reconstruction methods usually focus on reconstructing the holistic object and miss part information, which is crucial in robotic interaction and virtual reality applications. To address this issue, we make the first attempt to reconstruct 3D models with part-level representations in a unified framework. Given a single-view image as input, we first develop a feature enhancement encoder that incorporates discriminative local features into the feature representation. The local features are selected adaptively by a learnable local awareness module, and the enhanced local features are then fused with the global branch to form the 3D representation. We then develop a 3D part generator that decodes the image priors into 3D parts, trained with a 3D focal loss that enables the representation of small parts. Experimental results indicate that our model generates reliable part-level structures while achieving state-of-the-art performance in object-level recovery.
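The role of the focal loss in keeping small parts from being swamped by easy voxels can be illustrated with a small sketch. It uses the standard multi-class focal loss applied per voxel; the exact form of the 3D focal loss in the paper is not given here, so the function name `focal_loss_voxels`, the default `gamma=2.0`, and the flat list-of-voxels input layout are all assumptions made for illustration.

```python
import math


def focal_loss_voxels(probs, labels, gamma=2.0, eps=1e-7):
    """Mean multi-class focal loss over a flat list of voxels.

    probs  : list of per-voxel class-probability lists (hypothetical layout).
    labels : list of integer part labels, one per voxel.
    gamma  : focusing parameter (assumed value); gamma=0 gives cross-entropy.
    """
    total = 0.0
    for p, y in zip(probs, labels):
        # Probability assigned to the true label, clamped for a safe log.
        pt = min(max(p[y], eps), 1.0)
        # Well-classified voxels (pt close to 1) are down-weighted by (1 - pt)^gamma.
        total += -((1.0 - pt) ** gamma) * math.log(pt)
    return total / len(labels)
```

With `gamma=0` the function reduces to ordinary cross-entropy. For larger `gamma`, confidently classified voxels contribute very little, so the rarer, harder voxels (for example those belonging to small parts) account for a larger share of the loss.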


Overview of our 3D part-level reconstruction framework. Our framework is composed of a feature enhancement encoder and a 3D part generator. The input image is first passed through a VGG backbone to extract image features, which are then fed into the feature enhancement module, where the proposed Local Awareness Module (LAM) enhances the local features. After that, the overall features are fed into the 3D part generator to construct the holistic object with part-level information.

The Local Awareness Module

Illustration of the Local Awareness Module. We use a sliding window to crop local features into a patch list. An adaptive feature weight is then calculated to rerank the local features, and the top-K discriminative features with the highest responses are selected to enhance the global feature.
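The crop, rerank, and select steps described in this caption can be sketched roughly as follows. This is not the authors' implementation: it works on a single-channel 2D feature map stored as nested lists, and the learnable adaptive weight is replaced by a plain mean-activation score (`mean_response`) as a stand-in. All function names, the window size, the stride, and K are hypothetical.

```python
def sliding_patches(fmap, size, stride):
    """Crop size x size patches from a 2D feature map (list of rows)."""
    h, w = len(fmap), len(fmap[0])
    patches = []
    for top in range(0, h - size + 1, stride):
        for left in range(0, w - size + 1, stride):
            patch = [row[left:left + size] for row in fmap[top:top + size]]
            patches.append(((top, left), patch))
    return patches


def mean_response(patch):
    """Stand-in score: mean activation (the paper uses a learned weight)."""
    vals = [v for row in patch for v in row]
    return sum(vals) / len(vals)


def select_top_k(fmap, size=2, stride=1, k=2, score_fn=mean_response):
    """Rerank all patches by score and keep the k highest-scoring ones."""
    patches = sliding_patches(fmap, size, stride)
    ranked = sorted(patches, key=lambda item: score_fn(item[1]), reverse=True)
    return ranked[:k]


fmap = [[0, 0, 0],
        [0, 5, 1],
        [0, 1, 9]]
top = select_top_k(fmap, size=2, stride=1, k=2)
# Each entry is ((row, col), patch); the first one is the strongest patch.
```

In a trained model the score would come from learned parameters rather than a fixed mean, and the selected patches would be fused with the global feature instead of simply being returned.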


Visualized reconstruction results of the baseline model and our final Base-LAM model.


Reconstruction performance (mIoU) on the PartNet dataset. Part#i denotes the i-th part of each category.
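For readers unfamiliar with the metric, a part-level mIoU over voxel labels can be computed along these lines. The evaluation protocol used for the table is not spelled out here, so this sketch makes several assumptions: label 0 means empty space, parts are numbered 1 to `num_parts`, and parts absent from both prediction and ground truth are skipped. The name `part_miou` is hypothetical.

```python
def part_miou(pred, target, num_parts):
    """Mean IoU over part labels 1..num_parts for flat voxel label lists.

    Label 0 is treated as empty space and excluded (an assumption).
    Parts missing from both pred and target are skipped.
    """
    ious = []
    for part in range(1, num_parts + 1):
        inter = sum(1 for p, t in zip(pred, target) if p == part and t == part)
        union = sum(1 for p, t in zip(pred, target) if p == part or t == part)
        if union > 0:
            ious.append(inter / union)
    return sum(ious) / len(ious) if ious else 0.0


pred = [1, 1, 2, 0]
target = [1, 2, 2, 0]
score = part_miou(pred, target, num_parts=2)
```

Here each part has one correctly labeled voxel out of two voxels in the union, so both per-part IoUs are 0.5 and the mean is 0.5.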

Update logs

2020/05: We have updated the paper and code.


  • Dingfeng Shi, Yifan Zhao and Jia Li*. Reconstructing Part-level 3D Models from A Single Image.