Technical Documentation

NASA data sources and machine learning implementation

NASA Data Used

Landsat Collection 2

Surface Temperature

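  • Thermal data delivered on a 30-meter grid (resampled from the sensor's coarser native resolution, about 100 m for Landsat 8/9), suited to neighborhood-level analysis
  • Provides Land Surface Temperature (LST) in Kelvin once the product's scale factor and offset are applied
  • Long historical record (1982-present) for training robust models
  • Same 30-meter grid spacing as the vegetation data, which simplifies alignment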
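As a minimal sketch of how the stored values become temperatures: Collection 2 Level-2 surface temperature bands (for example ST_B10 on Landsat 8/9) hold scaled integers, and USGS publishes a scale factor and offset to convert them to Kelvin. The function names here are illustrative, and plain Python is used for clarity; a real pipeline would apply the same arithmetic to whole raster arrays.

```python
# Scale factor and offset published by USGS for Collection 2 Level-2
# surface temperature bands (e.g. ST_B10 on Landsat 8/9).
ST_SCALE = 0.00341802
ST_OFFSET = 149.0
FILL_VALUE = 0  # digital number used for pixels with no valid data


def st_dn_to_kelvin(dn):
    """Convert one surface temperature digital number to Kelvin.

    Returns None for fill pixels so they can be skipped downstream.
    """
    if dn == FILL_VALUE:
        return None
    return dn * ST_SCALE + ST_OFFSET


def kelvin_to_celsius(kelvin):
    """Convert Kelvin to degrees Celsius for display purposes."""
    return kelvin - 273.15
```

Keeping the values in Kelvin for modeling and converting to Celsius only for display avoids mixing units across data sources.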

Harmonized Landsat Sentinel-2 (HLS)

Vegetation Indices

  • 30-meter resolution NDVI and other vegetation indices
  • Revisit time of 2-3 days (vs. 16 days for a single Landsat satellite)
  • Harmonized between Landsat and Sentinel-2 for consistent measurements across sensors
  • Widely used proxy for vegetation health and density
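For reference, NDVI is computed per pixel from near-infrared and red surface reflectance. The sketch below assumes reflectance values already scaled to the 0-1 range; in HLS the red band is B04 and the near-infrared band is B05 (Landsat-based L30 product) or B8A (Sentinel-2-based S30 product). The function name is illustrative.

```python
def ndvi(nir, red):
    """Normalized Difference Vegetation Index for one pixel.

    NDVI = (NIR - Red) / (NIR + Red), ranging from -1 to 1.
    Returns None when the denominator is zero (e.g. fill pixels).
    """
    denom = nir + red
    if denom == 0:
        return None
    return (nir - red) / denom
```

Dense, healthy vegetation reflects strongly in the near-infrared and absorbs red light, so it yields values close to 1, while bare soil and built surfaces sit near 0.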

MODIS Land Surface Temperature

MOD11/MYD11

  • Daily global coverage at 1 km resolution
  • Useful as an independent cross-check for our higher-resolution Landsat results
  • Long-term baseline data for trend analysis
  • Captures city-wide temperature patterns
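One way such a cross-check could work is to average the fine Landsat grid up to the coarse MODIS cell size and compare the two. The sketch below is an assumption about the workflow, not a description of an existing pipeline: it presumes both grids are already reprojected and co-registered (MODIS is distributed in a sinusoidal projection, so this step is not trivial), and it ignores the difference in overpass times. `block_mean` and `rmse` are hypothetical helper names.

```python
import math


def block_mean(grid, factor):
    """Average a 2D list over non-overlapping factor x factor blocks.

    None entries (masked pixels) are skipped; a block with no valid
    pixels yields None. Incomplete edge blocks are dropped.
    """
    rows, cols = len(grid), len(grid[0])
    coarse = []
    for r0 in range(0, rows - factor + 1, factor):
        coarse_row = []
        for c0 in range(0, cols - factor + 1, factor):
            vals = [
                grid[r][c]
                for r in range(r0, r0 + factor)
                for c in range(c0, c0 + factor)
                if grid[r][c] is not None
            ]
            coarse_row.append(sum(vals) / len(vals) if vals else None)
        coarse.append(coarse_row)
    return coarse


def rmse(a, b):
    """Root-mean-square difference over cells valid in both grids."""
    sq = [
        (x - y) ** 2
        for row_a, row_b in zip(a, b)
        for x, y in zip(row_a, row_b)
        if x is not None and y is not None
    ]
    if not sq:
        raise ValueError("no overlapping valid cells")
    return math.sqrt(sum(sq) / len(sq))
```

With 30 m pixels and 1 km cells, the aggregation factor would be roughly 33; the toy factor used in testing is much smaller.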

Machine Learning Approach

Best Algorithm: Random Forest or XGBoost

Why these work best:

  • Handle non-linear vegetation-temperature relationships
  • R² values above 0.90 have been reported in urban heat island studies
  • Robust to outliers; XGBoost also handles missing values natively
  • Provide feature importance analysis
  • Fast training and prediction

Performance Metrics

R² Score: > 0.90
Training Speed: Fast
Robustness: High
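For readers unfamiliar with the metric, R² compares the model's squared error with that of a baseline that always predicts the mean observed temperature. A minimal sketch follows; the function name is illustrative, and scikit-learn ships an equivalent `r2_score`.

```python
def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot.

    1.0 means perfect predictions; 0.0 means no better than always
    predicting the mean of the observed values.
    """
    mean = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean) ** 2 for t in y_true)
    if ss_tot == 0:
        raise ValueError("R² is undefined when all observations are equal")
    return 1 - ss_res / ss_tot
```

An R² above 0.90 therefore means the model explains more than 90% of the variance in observed surface temperature on the evaluation data.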

Model Architecture

Input Features

  • NDVI (primary vegetation indicator)
  • Land cover type (urban, park, water, etc.)
  • Distance to coastline (important for Málaga)
  • Elevation/topography
  • Building density metrics
  • Time of year (seasonal effects)

Target Variable

Land Surface Temperature (LST) from Landsat thermal data

Training Process

1. Data Preparation

  • Align Landsat thermal and HLS vegetation data to the same 30 m grid
  • Create a training dataset linking NDVI values to temperature
  • Remove cloudy pixels using quality assessment bands
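The cloud-removal step can be sketched with the Landsat Collection 2 QA_PIXEL band, which is a bit field in which individual bits flag fill, cloud, cirrus and cloud shadow. Which flags to reject is a project choice and is assumed here; `is_usable` and `mask_lst` are hypothetical names. HLS carries its own Fmask layer with a different bit layout, which this sketch does not cover.

```python
# Bit flags in the Landsat Collection 2 QA_PIXEL band.
FILL = 1 << 0
DILATED_CLOUD = 1 << 1
CIRRUS = 1 << 2
CLOUD = 1 << 3
CLOUD_SHADOW = 1 << 4

# Assumption: any of these flags disqualifies a pixel from training.
REJECT = FILL | DILATED_CLOUD | CIRRUS | CLOUD | CLOUD_SHADOW


def is_usable(qa_value):
    """True when none of the rejected flags is set for this pixel."""
    return (qa_value & REJECT) == 0


def mask_lst(lst_values, qa_values):
    """Replace temperatures of unusable pixels with None."""
    return [
        lst if is_usable(qa) else None
        for lst, qa in zip(lst_values, qa_values)
    ]
```

Rejecting dilated cloud and shadow as well as cloud is the conservative option: thin cloud edges and shadows both bias surface temperature downward.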

2. Feature Engineering

  • Calculate vegetation indices (NDVI, EVI, SAVI)
  • Derive spatial metrics (distance to green spaces, urban density)
  • Add temporal features (day of year, seasonal indicators)
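A minimal sketch of two of these features, using the standard published formulas for EVI and SAVI with their commonly used coefficients. Inputs are assumed to be surface reflectance in the 0-1 range. The cyclic day-of-year encoding is one possible choice of seasonal indicator, not something the project has committed to, and all function names are illustrative.

```python
import math
from datetime import date


def evi(nir, red, blue):
    """Enhanced Vegetation Index with the standard coefficients."""
    denom = nir + 6.0 * red - 7.5 * blue + 1.0
    if denom == 0:
        return None
    return 2.5 * (nir - red) / denom


def savi(nir, red, soil_factor=0.5):
    """Soil-Adjusted Vegetation Index; 0.5 is the usual soil factor L."""
    denom = nir + red + soil_factor
    if denom == 0:
        return None
    return (1.0 + soil_factor) * (nir - red) / denom


def day_of_year_features(day):
    """Encode the day of year as a point on the unit circle.

    The sine/cosine pair keeps 31 December and 1 January close
    together, which a raw day number (365 vs. 1) would not.
    """
    angle = 2.0 * math.pi * day.timetuple().tm_yday / 365.25
    return math.sin(angle), math.cos(angle)
```

For example, `day_of_year_features(date(2024, 7, 1))` returns the encoding for day 183 of the year.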

3. Model Training

  • Split data using spatial cross-validation, so that neighboring, spatially autocorrelated pixels do not leak between training and test sets and inflate the scores
  • Train Random Forest with hyperparameter tuning
  • Validate performance using held-out spatial areas
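The spatial split could be sketched as follows: pixels are grouped into square blocks, and whole blocks are held out together so that a test pixel never has a training pixel from its own block. The block size, the round-robin fold assignment and the function names are all assumptions made for illustration. The same block IDs could instead be passed as `groups` to scikit-learn's `GroupKFold`.

```python
def block_id(row, col, block_size):
    """Identify the square spatial block that a pixel falls into."""
    return (row // block_size, col // block_size)


def spatial_folds(pixels, block_size, n_folds):
    """Yield (train_idx, test_idx) pairs that hold out whole blocks.

    pixels is a list of (row, col) grid positions; the returned
    indices refer to positions in that list.
    """
    ids = [block_id(r, c, block_size) for r, c in pixels]
    blocks = sorted(set(ids))
    # Assumption: blocks are assigned to folds round-robin.
    fold_of_block = {b: i % n_folds for i, b in enumerate(blocks)}
    for fold in range(n_folds):
        test_idx = [i for i, b in enumerate(ids) if fold_of_block[b] == fold]
        train_idx = [i for i, b in enumerate(ids) if fold_of_block[b] != fold]
        yield train_idx, test_idx
```

Blocks should be larger than the distance over which surface temperature is correlated; otherwise pixels on either side of a block boundary still share information.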

Current PoC Implementation

For this Proof of Concept, we use Landsat Collection 2 thermal data (100 m native resolution) mapped onto a 3D model of the city of Málaga.

The 3D model is built from OpenStreetMap's open geographic data. The same approach lets us generate 3D representations of any city with sufficient OpenStreetMap coverage and import them into Blender for visualization and further development.

Temperature deltas are currently approximated rather than predicted, because the visualization is not yet connected to the trained machine learning models. Even so, the PoC demonstrates the potential of our integrated approach for urban heat island analysis and green infrastructure planning.

Technical Stack

Data Source: Landsat Collection 2
3D Modeling: OpenStreetMap + Blender
Visualization: Three.js + React