GeoLlama Logo

🌍 GeoLlama

Large Language and Vision Assistant for Earth Observation

Kha Do


A Visual Large Language Assistant for Multi-spectral Remote Sensing Data

Installation · Datasets · Training · Evaluation


Overview

Contributions

  • GeoLlama: Introduce GeoLlama, a large Vision-Language Model (VLM) for multi-spectral remote-sensing imagery.

    • Propose a novel Grounded Spectral-aware Connector (GSC) module that integrates visible and non-visible spectral bands.
    • Inject spectral knowledge into all cross-attention layers of the language model, enabling the full exploitation of spectral cues and expert knowledge across diverse remote sensing tasks.
  • GeoLlamaInstData: Construct GeoLlamaInstData, a large-scale instruction-following dataset for multi-spectral imagery.

    • Pair spectral images with rich, object-centric, and deep analysis conversations.
    • Provide rigorous training and benchmarking for VLMs in remote-sensing applications.
  • Experimental Results: Demonstrate that GeoLlama, by effectively leveraging spectral information, outperforms both general-purpose VLMs and specialized remote sensing VLMs.

    • Achieve superior performance in multi-label classification, image description, and visual question answering tasks.
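The spectral-knowledge injection described above — language hidden states attending over visible and non-visible band features at cross-attention layers — can be sketched roughly as follows. This is a minimal single-head illustration; all names (`spectral_cross_attention`), dimensions, and the residual-injection form are assumptions, not the actual GSC implementation.

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def spectral_cross_attention(text_hidden, spectral_tokens, d_k=64, seed=0):
    """Hypothetical single-head cross-attention: language hidden states act
    as queries over spectral-band tokens (keys/values), with a residual add."""
    rng = np.random.default_rng(seed)
    d_text = text_hidden.shape[-1]
    d_spec = spectral_tokens.shape[-1]
    # Randomly initialized projections stand in for learned weights.
    Wq = rng.normal(scale=0.02, size=(d_text, d_k))
    Wk = rng.normal(scale=0.02, size=(d_spec, d_k))
    Wv = rng.normal(scale=0.02, size=(d_spec, d_text))
    q = text_hidden @ Wq
    k = spectral_tokens @ Wk
    v = spectral_tokens @ Wv
    attn = softmax(q @ k.T / np.sqrt(d_k))  # (n_text, n_spectral)
    return text_hidden + attn @ v           # inject spectral context residually

# Toy shapes: 8 text tokens (dim 32); RGB + non-visible bands -> 12 spectral tokens (dim 16)
out = spectral_cross_attention(np.zeros((8, 32)), np.ones((12, 16)))
print(out.shape)  # (8, 32)
```

In the real model this fusion would repeat at every cross-attention layer of the language backbone, so spectral cues remain available throughout decoding.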



Installation

⚠️ Important: This project is designed for Linux; other operating systems are not officially supported.

1. Clone Repository

git clone https://github.com/ikhado/GeoLlama.git
cd GeoLlama

2. Setup Environment

# Create and activate conda environment
conda create -n GeoLlama python=3.10 -y
conda activate GeoLlama

# Upgrade pip and install package
pip install --upgrade pip
pip install -e .

3. Install Training Dependencies

# Install training-specific packages
pip install -e ".[train]"
pip install flash-attn --no-build-isolation

Updating to Latest Version

git pull
pip install -e .

📊 Datasets

🌐 Primary Dataset

Download the image-caption pairs from ChatEarthNet.

Data construction process:

Data Construction Process

📁 Available JSON Files

| Dataset Type | File | Description |
| --- | --- | --- |
| Pre-training | `GeoLlama_Pre_train.json` | Initial training dataset |
| Instruction Tuning | `GeoLlama_Instruct.json` | Fine-tuning dataset |
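The exact schema is defined by the JSON files above; the record below follows the common LLaVA-style instruction format and is an illustrative assumption only (field names, paths, and text are hypothetical).

```python
import json

# Hypothetical record in the common LLaVA-style instruction format;
# the actual schema of GeoLlama_Instruct.json may differ.
record = {
    "id": "sample_0001",
    "image": "patches/S2_tile_0001.png",
    "conversations": [
        {"from": "human", "value": "<image>\nDescribe the land cover in this scene."},
        {"from": "gpt", "value": "The scene is dominated by cropland, with a river crossing the north."},
    ],
}

# Round-trip through JSON to confirm the record is serializable.
loaded = json.loads(json.dumps([record]))
print(len(loaded), loaded[0]["conversations"][0]["from"])  # 1 human
```

Checking a few sample records against the downloaded files is a quick way to confirm the real field names before writing a data loader.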

🎯 Training

Model architecture:

GeoLlama Model Architecture

We utilize the pre-trained backbone Llama-3.2-11B-Vision-Instruct and train the projector from scratch.

Training Scripts

| Training Phase | Script | Dataset |
| --- | --- | --- |
| Pre-training | `pre_train_llama32.sh` | Pre-training dataset |
| Visual Instruction Tuning | `fine_tune_llama32.sh` | Instruction fine-tuning dataset |

📈 Evaluation

For evaluation procedures, refer to the evaluation script:

./geollama/eval/test.py
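Multi-label classification (one of the evaluated tasks) is commonly scored with sample-averaged F1. The sketch below shows that metric; it is a generic illustration and not necessarily the exact metric implemented in `test.py`.

```python
import numpy as np

def multilabel_f1(y_true, y_pred):
    """Sample-averaged F1 for binary multi-label predictions, shape (N, C)."""
    tp = (y_true & y_pred).sum(axis=1)
    pred_pos = y_pred.sum(axis=1)
    true_pos = y_true.sum(axis=1)
    # Guard against division by zero for empty prediction/label sets.
    precision = np.divide(tp, pred_pos, out=np.zeros_like(tp, float), where=pred_pos > 0)
    recall = np.divide(tp, true_pos, out=np.zeros_like(tp, float), where=true_pos > 0)
    denom = precision + recall
    f1 = np.divide(2 * precision * recall, denom, out=np.zeros_like(denom), where=denom > 0)
    return f1.mean()

y_true = np.array([[1, 0, 1], [0, 1, 0]])  # ground-truth label sets per image
y_pred = np.array([[1, 0, 0], [0, 1, 0]])  # model predictions
print(round(multilabel_f1(y_true, y_pred), 3))  # 0.833
```

The first sample scores F1 = 2/3 (perfect precision, recall 0.5) and the second scores 1.0, so the sample average is 5/6 ≈ 0.833.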

Visualization Results:

Example Results


Acknowledgements

We gratefully acknowledge the following projects upon which our work is built:

GeoChat · Llama-3.2 Vision


Made with ❤️ for the Remote Sensing Community
