Resources
Here we list useful resources for getting started with geospatial machine learning. This is by no means an exhaustive list, just a few examples that ML4GEO members have found useful.
For a more comprehensive list, see the one provided by satellite-image-deep-learning, which is pretty awesome! 😎
Intro to GeoML
Open Access Courses:
- GeoSMART Curriculum
- WV View courses in GIS, remote sensing, and geospatial deep learning.
- RadiantEarth ML4EO Bootcamp 2021
Other introductory resources:
- PyTorch Tutorials: PyTorch is a popular machine learning library for Python.
- PyTorch Lightning: A high-level interface for PyTorch.
- TorchGeo Tutorials: The torchgeo package is an extension to PyTorch and PyTorch Lightning that provides popular datasets, model architectures, and common image transformations for geospatial data.
Geospatial foundation models
Below we’ve collated a list of helpful links and tutorials to get you started on geospatial foundation models. By no means is this an exhaustive list.
Foundation Models / Tutorials
TerraTorch: A flexible fine-tuning framework for Geospatial Foundation Models (GFMs) based on TorchGeo and Lightning. It supports models from the Prithvi, TerraMind, and Granite series as well as models from TorchGeo and timm.
- https://github.com/IBM/terratorch/tree/main/examples/notebooks
TerraMind: The latest GFM from ESA. It can be applied to classical deep learning tasks as well as generative tasks (e.g. generating S1 from S2 data).
- https://github.com/IBM/ML4EO-workshop-2025
- Includes examples of disaster response with S1 and S2 data, and of crop segmentation with multi-temporal data
Google Satellite v1 Embeddings:
- We have scripts to help download rasters of embeddings with the Python API
Earth Index:
- https://www.earthgenome.org/earth-index
- No-code tool with a great interface for finding similar satellite image tiles in an active learning environment; currently early access only
IBM-NASA Prithvi Models: Three foundation models have been released to date: Prithvi-EO-1.0 and Prithvi-EO-2.0, which use Earth observation data from NASA’s Harmonized Landsat and Sentinel-2 (HLS), and Prithvi-WxC-1.0, which uses weather and climate data from NASA’s MERRA-2.
- https://huggingface.co/ibm-nasa-geospatial
SatCLIP: Predict location coordinates given satellite imagery
- https://github.com/microsoft/satclip/tree/main/notebooks
Clay: EO foundation model trained on Landsat, S2, S1, NAIP, LINZ, MODIS
- https://clay-foundation.github.io/model/index.html
- Has a nice tutorial visualising embeddings
DOFA: a unified multimodal foundation model for different data modalities in remote sensing and Earth observation.
Aurora: A Foundation Model for the Earth System.
Embeddings - Things to Consider when using the Google Ones
- https://www.linkedin.com/posts/mbforr_ai-isnt-magic-its-math-and-sometimes-activity-7353415998424711169-CKsX?utm_source=share&utm_medium=member_desktop&rcm=ACoAADV71oUBf4-Yt6U7QfjkZLmIRivB7oVFbHA
Technical Understanding of Key Concepts
- https://developers.google.com/machine-learning/crash-course/embeddings/embedding-space
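As a rough illustration of what "similar in embedding space" means, and of the kind of tile-similarity search mentioned above, here is a minimal sketch in plain Python. The tile names, the 3-dimensional vectors, and the helper functions `cosine_similarity` and `most_similar` are all made up for this example; real foundation-model embeddings have many more dimensions and would normally be handled with an array library. It is not how any of the tools listed here are implemented.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def most_similar(query, embeddings, k=2):
    """Return the k (tile_id, score) pairs with the highest similarity to query."""
    scored = [
        (tile_id, cosine_similarity(query, vector))
        for tile_id, vector in embeddings.items()
    ]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)[:k]


# Hypothetical toy embeddings, invented purely for illustration.
tile_embeddings = {
    "forest_a": [0.9, 0.1, 0.0],
    "forest_b": [0.8, 0.2, 0.1],
    "urban_a": [0.1, 0.9, 0.3],
    "water_a": [0.0, 0.2, 0.9],
}

# Use one tile as the query and rank every tile (including itself) against it.
query = tile_embeddings["forest_a"]
ranked = most_similar(query, tile_embeddings, k=2)
print(ranked)
```

In this toy setup the query tile ranks first (it is compared with itself) and the other forest tile ranks second, because their vectors point in nearly the same direction. Cosine similarity is only one possible choice of measure; which one suits a given embedding product is worth checking in its documentation.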
Challenges
- ESA Phi Lab - AI4EO challenges
- AI4EO challenges
- Terramind Blue Sky challenge
- Zindi - Amini GeoFM Decoding the Field Challenge
ML-ready datasets
Some useful datasets created with machine learning in mind.
Large ML datahubs
Here are some datahubs which host many ML-ready training sets:
- Source Cooperative
- TorchGeo DataModules - TorchGeo has a number of datasets readily available to import, prepared in an ML-ready format. If you are using TerraTorch, this is particularly useful: TerraTorch is built on top of TorchGeo, so these datasets can be used off the shelf.
Unlabelled
- ESA’s MajorTOM dataset - 50+TB of Sentinel-1 and -2 data and DEMs.
Labelled
- PhilEO - 400GB Sentinel-2 dataset of the PhilEO Bench containing labels for the three downstream tasks of building density estimation, road segmentation, and land cover classification.