Generating LOD3 building models from structure-from-motion and semantic segmentation

B. G. Pantoja-Rosero, R. Achanta, M. Kozinski, P. Fua, Fernando Perez-Cruz, K. Beyer: Generating LOD3 building models from structure-from-motion and semantic segmentation. In: Automation in Construction, vol. 141, pp. 104430, 2022, ISSN: 0926-5805.

Abstract

This paper describes a pipeline for automatically generating level of detail (LOD) models (digital twins), specifically LOD2 and LOD3, from free-standing buildings. Our approach combines structure from motion (SfM) with deep-learning-based segmentation techniques. Given multiple-view images of a building, we compute a three-dimensional (3D) planar abstraction (LOD2 model) of its point cloud using SfM techniques. To obtain LOD3 models, we use deep learning to perform semantic segmentation of the openings in the two-dimensional (2D) images. Unlike existing approaches, we do not rely on complex input, pre-defined 3D shapes or manual intervention. To demonstrate the robustness of our method, we show that it can generate 3D building shapes from a collection of building images with no further input. For evaluating reconstructions, we also propose two novel metrics. The first is a Euclidean-distance-based correlation of the 3D building model with the point cloud. The second involves re-projecting 3D model facades onto source photos to determine Dice scores with respect to the ground-truth masks. Finally, we make the code, the image datasets, SfM outputs, and digital twins reported in this work publicly available at github.com/eesd-epfl/LOD3_buildings and doi.org/10.5281/zenodo.6651663. With this work we aim to contribute to research in applications such as construction management, city planning, and mechanical analysis, among others.
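The second metric in the abstract compares re-projected facade openings with ground-truth masks using the Dice score. The following is a minimal sketch of the standard Dice coefficient, 2|A∩B| / (|A| + |B|), for two binary masks. It is not taken from the authors' repository: the function name `dice_score`, the nested-list mask representation, the toy masks, and the convention that two empty masks score 1.0 are all assumptions made for illustration.

```python
from typing import Sequence

Mask = Sequence[Sequence[int]]  # rows of 0/1 pixel labels (assumed representation)


def dice_score(pred: Mask, truth: Mask) -> float:
    """Dice coefficient 2|A∩B| / (|A| + |B|) for two binary masks of equal shape."""
    if len(pred) != len(truth) or any(len(p) != len(t) for p, t in zip(pred, truth)):
        raise ValueError("masks must have the same shape")

    # Count pixels that are foreground in both masks.
    intersection = sum(
        1
        for pred_row, truth_row in zip(pred, truth)
        for p, t in zip(pred_row, truth_row)
        if p and t
    )
    # Count foreground pixels in each mask separately.
    total = sum(1 for row in pred for p in row if p) + sum(
        1 for row in truth for t in row if t
    )

    # Assumed convention: two empty masks are treated as a perfect match.
    return 1.0 if total == 0 else 2.0 * intersection / total


# Hypothetical toy masks: a re-projected opening versus its ground-truth mask.
projected = [
    [0, 1, 1, 0],
    [0, 1, 1, 0],
    [0, 0, 0, 0],
]
ground_truth = [
    [0, 1, 1, 1],
    [0, 1, 1, 1],
    [0, 0, 0, 0],
]
score = dice_score(projected, ground_truth)
```

Here the masks overlap in 4 pixels and contain 4 and 6 foreground pixels respectively, so the score is 2·4 / (4 + 6) = 0.8. How the paper aggregates such scores across facades and images is not stated in the abstract.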

BibTeX

@article{PANTOJAROSERO2022104430,
title = {Generating LOD3 building models from structure-from-motion and semantic segmentation},
author = {B. G. Pantoja-Rosero and R. Achanta and M. Kozinski and P. Fua and Fernando Perez-Cruz and K. Beyer},
url = {https://www.sciencedirect.com/science/article/pii/S092658052200303X},
doi = {10.1016/j.autcon.2022.104430},
issn = {0926-5805},
year  = {2022},
date = {2022-01-01},
urldate = {2022-01-01},
journal = {Automation in Construction},
volume = {141},
pages = {104430},
abstract = {This paper describes a pipeline for automatically generating level of detail (LOD) models (digital twins), specifically LOD2 and LOD3, from free-standing buildings. Our approach combines structure from motion (SfM) with deep-learning-based segmentation techniques. Given multiple-view images of a building, we compute a three-dimensional (3D) planar abstraction (LOD2 model) of its point cloud using SfM techniques. To obtain LOD3 models, we use deep learning to perform semantic segmentation of the openings in the two-dimensional (2D) images. Unlike existing approaches, we do not rely on complex input, pre-defined 3D shapes or manual intervention. To demonstrate the robustness of our method, we show that it can generate 3D building shapes from a collection of building images with no further input. For evaluating reconstructions, we also propose two novel metrics. The first is a Euclidean-distance-based correlation of the 3D building model with the point cloud. The second involves re-projecting 3D model facades onto source photos to determine Dice scores with respect to the ground-truth masks. Finally, we make the code, the image datasets, SfM outputs, and digital twins reported in this work publicly available at github.com/eesd-epfl/LOD3_buildings and doi.org/10.5281/zenodo.6651663. With this work we aim to contribute to research in applications such as construction management, city planning, and mechanical analysis, among others.},
keywords = {3D building models, Deep learning, Digital twin, LOD models, Masonry buildings, Structure from motion},
pubstate = {published},
tppubtype = {article}
}