GeoTransfer: Generalizable Few-Shot Multi-View Reconstruction via Transfer Learning

INRIA
ECCVW 2024

Fast training of GeoTransfer. Our transfer learning approach dramatically cuts the time needed to train a generalizable occupancy network with state-of-the-art 3D reconstruction performance: training time drops by orders of magnitude, from several days to 3.5 hours!

Abstract

This paper presents a novel approach to sparse 3D reconstruction that leverages the expressive power of Neural Radiance Fields (NeRFs) and fast transfer of their features to learn accurate occupancy fields. Existing 3D reconstruction methods for sparse inputs still struggle to capture intricate geometric details and can fail in occluded regions. NeRFs, on the other hand, excel at modeling complex scenes but do not offer a direct means of extracting meaningful geometry. Our proposed method offers the best of both worlds by transferring the information encoded in NeRF features into an accurate occupancy field representation. We utilize a pre-trained, generalizable, state-of-the-art NeRF network to capture detailed scene radiance information, and rapidly transfer this knowledge to train a generalizable implicit occupancy network. This process leverages the scene geometry encoded in the generalizable NeRF prior and refines it into an occupancy field, yielding a more precise generalizable representation of 3D space. The transfer learning approach reduces training time by orders of magnitude (from several days to 3.5 hours), obviating the need to train generalizable sparse surface reconstruction methods from scratch. Additionally, we introduce a novel loss on volumetric rendering weights that aids the learning of accurate occupancy fields, along with a normal loss that globally smooths the occupancy field. We evaluate our approach on the DTU dataset and demonstrate state-of-the-art reconstruction accuracy, especially in challenging scenarios with sparse input data and occluded regions. We further demonstrate the generalization capability of our method with qualitative results on the BlendedMVS dataset, without any retraining.
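
To make the recipe concrete, below is a minimal PyTorch sketch of the ingredients the abstract describes: a small occupancy head trained on frozen per-point features from a pre-trained generalizable NeRF, together with illustrative versions of the rendering-weight loss and the normal loss. All names and design choices here (OccupancyHead, occupancy_to_weights, treating occupancy as the per-sample alpha, reusing features at perturbed points) are hypothetical simplifications for exposition, not the authors' implementation.

import torch
import torch.nn as nn
import torch.nn.functional as F

class OccupancyHead(nn.Module):
    # Small MLP mapping frozen NeRF features at a 3D point to occupancy in
    # (0, 1); the 0.5 level set gives the reconstructed surface.
    def __init__(self, feat_dim, hidden=128):
        super().__init__()
        self.mlp = nn.Sequential(
            nn.Linear(feat_dim + 3, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, 1),
        )

    def forward(self, pts, feats):
        return torch.sigmoid(self.mlp(torch.cat([pts, feats], dim=-1)))

def occupancy_to_weights(occ):
    # Standard volume-rendering weights w_i = alpha_i * prod_{j<i}(1 - alpha_j),
    # treating occupancy directly as the per-sample alpha (an assumption).
    alpha = occ.squeeze(-1)                      # (num_rays, num_samples)
    trans = torch.cumprod(
        torch.cat([torch.ones_like(alpha[:, :1]), 1.0 - alpha + 1e-10], dim=-1),
        dim=-1)[:, :-1]                          # accumulated transmittance
    return alpha * trans

def rendering_weight_loss(weights_occ, weights_nerf):
    # Pull the occupancy-derived weights toward the (detached) rendering
    # weights of the pre-trained NeRF teacher along each ray.
    return F.l1_loss(weights_occ, weights_nerf.detach())

def normal_loss(occ_head, pts, feats, eps=1e-2):
    # Compare occupancy-field normals at a point and at a small random
    # perturbation of it, encouraging globally smooth geometry. Reusing the
    # same features at the perturbed point is a further simplification.
    def normals(p):
        p = p.detach().requires_grad_(True)
        occ = occ_head(p, feats)
        (grad,) = torch.autograd.grad(occ.sum(), p, create_graph=True)
        return F.normalize(grad, dim=-1)
    return (normals(pts) - normals(pts + eps * torch.randn_like(pts))).abs().mean()

In this sketch, pulling the occupancy-derived rendering weights toward those of the frozen NeRF teacher is what transfers the geometry knowledge already encoded in the radiance field, while the normal term regularizes the occupancy field toward globally smooth surfaces.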

Comparison Results

Visual comparisons in the sparse reconstruction setting on the DTU dataset.

Visual comparisons in the sparse novel-view synthesis setting on the DTU dataset.

Visual comparison of surface reconstruction results on the BlendedMVS dataset.

BibTeX

@article{jena2024geotransfer,
  title={GeoTransfer: Generalizable Few-Shot Multi-View Reconstruction via Transfer Learning},
  author={Jena, Shubhendu and Multon, Franck and Boukhayma, Adnane},
  journal={arXiv preprint arXiv:2408.14724},
  year={2024}
}