OpenFACADES

An Open Framework for Architectural Caption and Attribute Data Enrichment via Street View Imagery

Overview

OpenFACADES is an open-source framework designed to enrich building profiles with objective attributes and semantic descriptors by leveraging multimodal crowdsourced data and large vision-language models. It provides tools for integrating diverse datasets, automating building facade detection, and generating detailed annotations at scale.

Tutorial preview

1. Retrieving building image data: Open In Colab

2. Building image labeling and captioning using VLM: Open In Colab

What can our method do?

  1. Integrating multimodal crowdsourced data: acquire building data and street view imagery from crowdsourced platforms for selected areas, and conduct isovist analysis to integrate them.

  2. Retrieving building image data: perform object detection to identify target buildings in panoramic images and reproject them back to a holistic perspective view, with image filtering functions to select high-quality building images.

[Figure: retrieving building images]

  3. Establishing dataset and multimodal models: apply state-of-the-art multimodal large language models to annotate building images with multiple attributes, including building type, surface material, number of floors, and building age, and provide detailed descriptive captions.

[Figure: (a) building attribute labeling; (b) image captioning]
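
To make the reprojection idea in step 2 more concrete, here is a minimal sketch of how a detected bounding box in an equirectangular panorama could be turned into a viewing direction and horizontal field of view for a perspective crop. This is not the code used in OpenFACADES; the function name `bbox_to_view` is hypothetical, and the sketch assumes that the centre column of the panorama corresponds to yaw 0 and that yaw increases to the right. It relies only on the fact that pixel columns in an equirectangular image map linearly to longitude.

```python
from typing import Tuple


def bbox_to_view(x_min: float, x_max: float, pano_width: int) -> Tuple[float, float]:
    """Return (yaw_deg, horizontal_fov_deg) for a box in an equirectangular panorama.

    Assumptions (illustrative, not taken from the repository):
    - the panorama spans 360 degrees horizontally;
    - the centre column is yaw 0, and yaw lies in [-180, 180).
    """
    if pano_width <= 0:
        raise ValueError("pano_width must be positive")

    deg_per_px = 360.0 / pano_width

    # A box that crosses the left/right seam has x_max < x_min;
    # the modulo recovers its true width in pixels.
    width_px = (x_max - x_min) % pano_width
    center_px = (x_min + width_px / 2.0) % pano_width

    yaw_deg = center_px * deg_per_px - 180.0
    fov_deg = width_px * deg_per_px
    return yaw_deg, fov_deg
```

The resulting yaw and field of view could then be passed to a perspective reprojection routine, such as the kind provided by the Equirec2Perspec project acknowledged below.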

To Do List

  • Release code for building data harmonization.
  • Release code for integrating building and street view imagery data.
  • Develop Google Colab tutorial for retrieving building image data.
  • Release training code for fine-tuning InternVL models.
  • Release training data.
  • Develop Google Colab tutorial for building labeling and captioning.
  • Release fine-tuned model (1B, 2B).
  • Integrate more SVI platforms into the framework.
  • Expand criteria for building image selection.

Installation

To install OpenFACADES, follow these steps:

  1. Clone the repository:
git clone https://github.com/seshing/OpenFACADES.git
  2. Install the package and required dependencies:
conda create -n openfacades
conda activate openfacades

pip install -e OpenFACADES/.
pip install -r OpenFACADES/requirements.txt

Note: The package uses PyTorch and torchvision, which you may need to install separately. Please refer to the official PyTorch website for installation instructions.
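
A quick way to check whether the two packages are importable in the active environment is sketched below. The helper `missing_packages` is a hypothetical name used for illustration; it only uses the standard library and does not import the packages themselves.

```python
import importlib.util


def missing_packages(names=("torch", "torchvision")):
    """Return the subset of top-level package names that cannot be found."""
    return [name for name in names if importlib.util.find_spec(name) is None]
```

Calling `missing_packages()` inside the `openfacades` environment returns an empty list when both packages are available, and otherwise lists the ones that still need to be installed.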

Quick start

To acquire individual building images (Steps 1 & 2 above) for an area, you can simply run:

python OpenFACADES/run.py \
  --bbox=[left,bottom,right,top] \
  --api_key='YOUR_MAPILLARY_API_KEY'

Note: Please check that Mapillary has panoramic images available for the selected area.
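
One possible way to perform this check programmatically is sketched below. It is not part of OpenFACADES. It assumes the Mapillary Graph API v4 image search endpoint (`https://graph.mapillary.com/images`) and its `bbox`, `is_pano`, `fields` and `limit` parameters; please verify these against the current Mapillary API documentation before relying on them. The function names are hypothetical.

```python
import json
import urllib.parse
import urllib.request

# Assumed endpoint of the Mapillary Graph API v4 image search.
MAPILLARY_IMAGES_URL = "https://graph.mapillary.com/images"


def build_pano_query(bbox, api_key, limit=1):
    """Build a request URL asking for panoramic images inside bbox.

    bbox is (left, bottom, right, top) in degrees, as in the run.py example.
    """
    left, bottom, right, top = bbox
    params = {
        "access_token": api_key,
        "fields": "id,is_pano",
        "bbox": f"{left},{bottom},{right},{top}",
        "is_pano": "true",  # assumed filter for panoramic images only
        "limit": str(limit),
    }
    return MAPILLARY_IMAGES_URL + "?" + urllib.parse.urlencode(params)


def has_panoramas(bbox, api_key, timeout=30):
    """Return True if the (assumed) API reports at least one panorama in bbox."""
    url = build_pano_query(bbox, api_key)
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        payload = json.load(resp)
    return len(payload.get("data", [])) > 0
```

For example, `has_panoramas((8.552, 47.372, 8.554, 47.376), 'YOUR_MAPILLARY_API_KEY')` would issue a single request for the Zurich area.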

Example bbox:
[8.552,47.372,8.554,47.376]: an area in Zurich, Switzerland;
[-81.382,28.540,-81.376,28.543]: an area in Orlando, USA;
[-70.660,-33.442,-70.655,-33.437]: an area in Santiago, Chile;
[-73.578,45.497,-73.569,45.502]: an area in Montreal, Canada;
[37.618,55.758,37.628,55.763]: an area in Moscow, Russia;
[25.273,54.684,25.283,54.687]: an area in Vilnius, Lithuania.
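
When running the script for several areas, it can help to validate each bounding box before launching a job. The sketch below parses the `[left,bottom,right,top]` format shown above and builds the argument list for `run.py`. The helpers `parse_bbox` and `build_command` are hypothetical and not part of the package; the only flags used are the two shown in the quick-start command.

```python
import sys


def parse_bbox(text):
    """Parse '[left,bottom,right,top]' into four floats and validate them."""
    parts = text.strip().strip("[]").split(",")
    if len(parts) != 4:
        raise ValueError(f"expected 4 values, got {len(parts)}: {text!r}")
    left, bottom, right, top = (float(p) for p in parts)
    if not (-180.0 <= left < right <= 180.0):
        raise ValueError("longitudes must satisfy -180 <= left < right <= 180")
    if not (-90.0 <= bottom < top <= 90.0):
        raise ValueError("latitudes must satisfy -90 <= bottom < top <= 90")
    return left, bottom, right, top


def build_command(bbox, api_key, script="OpenFACADES/run.py"):
    """Return the argument list for one run.py invocation."""
    left, bottom, right, top = bbox
    return [
        sys.executable,
        script,
        f"--bbox=[{left},{bottom},{right},{top}]",
        f"--api_key={api_key}",
    ]
```

The returned list can be passed to `subprocess.run(cmd, check=True)`. Passing a list rather than a shell string also avoids shells that treat the unquoted square brackets as a glob pattern.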

Output paths:
building footprint: output/01_data/footprint.geojson;
detected building images: output/02_img/individual_building;
building image ids after filtering: output/02_img/individual_building_select.csv.
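
After a run, the outputs can be inspected with a few lines of standard-library Python. The sketch below assumes that `footprint.geojson` is a GeoJSON FeatureCollection and that the CSV file has a header row; its column names are not documented here, so the code only reports whatever it finds. The function name `summarize_outputs` is hypothetical.

```python
import csv
import json
from pathlib import Path


def summarize_outputs(output_dir="output"):
    """Count footprints, detected images and selected rows under output_dir."""
    root = Path(output_dir)

    # Assumed to be a GeoJSON FeatureCollection.
    with open(root / "01_data" / "footprint.geojson", encoding="utf-8") as f:
        footprints = json.load(f)

    # Assumed to have a header row; column names are read, not hard-coded.
    select_csv = root / "02_img" / "individual_building_select.csv"
    with open(select_csv, newline="", encoding="utf-8") as f:
        rows = list(csv.DictReader(f))

    image_dir = root / "02_img" / "individual_building"
    n_images = 0
    if image_dir.is_dir():
        n_images = sum(1 for p in image_dir.iterdir() if p.is_file())

    return {
        "footprints": len(footprints.get("features", [])),
        "images": n_images,
        "selected": len(rows),
        "columns": list(rows[0].keys()) if rows else [],
    }
```

Comparing `images` with `selected` gives a rough sense of how many detected building images passed the filtering step.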

Model Training

To fine-tune InternVL models for building facade analysis and captioning, see our detailed training guide:

📖 Fine-tuning Guide — Instructions for training InternVL models on building data

🗂️ OpenFACADES Training Dataset — Training data on Hugging Face

Use case

  1. Liang, X., Cheng, S., Biljecki, F. (2025, June). Decoding Characteristics of Building Facades Using Street View Imagery and Vision-Language Model. In 19th International Conference on Computational Urban Planning & Urban Management, CUPUM 2025.
    https://osf.io/abyqh/files/osfstorage/685400519a7097303ec89a95

  2. Liang, X., Chang, J.H., Gao, S., Zhao, T. and Biljecki, F., 2024. Evaluating human perception of building exteriors using street view imagery. Building and Environment, 263, p.111875.
    https://doi.org/10.1016/j.buildenv.2024.111875

  3. Lei, B., Liang, X. and Biljecki, F., 2024. Integrating human perception in 3D city models and urban digital twins. ISPRS Annals of the Photogrammetry, Remote Sensing and Spatial Information Sciences, 10, pp.211-218.
    https://isprs-annals.copernicus.org/articles/X-4-W5-2024/211/2024/isprs-annals-X-4-W5-2024-211-2024.html

Citation

Please cite the following paper if you use OpenFACADES in a scientific publication:

@article{liang2025openfacades,
        title = {OpenFACADES: An open framework for architectural caption and attribute data enrichment via street view imagery},
        author = {Liang, Xiucheng and Xie, Jinheng and Zhao, Tianhong and Stouffs, Rudi and Biljecki, Filip},
        year = 2025,
        journal = {ISPRS Journal of Photogrammetry and Remote Sensing},
        volume = {230},
        pages = {918--942},
        issn = {0924-2716}
        }

Acknowledgement

We acknowledge the contributors of OpenStreetMap, Mapillary and other platforms for providing valuable open data resources and code that support street-level imagery research and applications. This project is also built with reference to the code of the following projects: InternVL, ZenSVI, GroundingDINO and Equirec2Perspec. Thanks for their awesome work!
