- GeoLlama: Introduce GeoLlama, a large Vision-Language Model (VLM) for multi-spectral remote-sensing imagery.
  - Propose a novel Grounded Spectral-aware Connector (GSC) module that integrates visible and non-visible spectral bands.
  - Inject spectral knowledge into all cross-attention layers of the language model, enabling the full exploitation of spectral cues and expert knowledge across diverse remote-sensing tasks.
- GeoLlamaInstData: Construct GeoLlamaInstData, a large-scale instruction-following dataset for multi-spectral imagery.
  - Pair spectral images with rich, object-centric, in-depth analysis conversations.
  - Provide rigorous training and benchmarking for VLMs in remote-sensing applications.
- Experimental Results: Demonstrate that GeoLlama, by effectively leveraging spectral information, outperforms both general-purpose VLMs and specialized remote-sensing VLMs.
  - Achieve superior performance in multi-label classification, image description, and visual question answering tasks.
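
To make the idea of a non-visible spectral cue concrete, here is a minimal Python sketch of the Normalized Difference Vegetation Index (NDVI), a standard index that combines a visible band (red) with a non-visible one (near-infrared). This is only a generic illustration: the source does not state that GeoLlama or the GSC module computes NDVI, and the function name and reflectance values below are made up for the example.

```python
def ndvi(red: float, nir: float) -> float:
    """Return NDVI = (NIR - Red) / (NIR + Red) for a single pixel.

    Illustrative helper only; not part of the GeoLlama code base.
    """
    total = nir + red
    if total == 0:
        # The index is undefined when both bands are zero; this sketch returns 0.0.
        return 0.0
    return (nir - red) / total


# Hypothetical surface reflectances, chosen only for illustration.
vegetation = ndvi(red=0.05, nir=0.45)  # NIR well above red: strongly positive
water = ndvi(red=0.04, nir=0.02)       # NIR below red: negative
print(vegetation > water)
```

An RGB-only model never sees the near-infrared value, so it cannot form this kind of contrast directly. That is the general motivation for feeding non-visible bands to the model.
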
⚠️ Important: This project is designed for Linux. For other operating systems, please refer to:

    git clone https://github.com/ikhado/GeoLlama.git
    cd GeoLlama

    # Create and activate conda environment
    conda create -n GeoLlama python=3.10 -y
    conda activate GeoLlama

    # Upgrade pip and install package
    pip install --upgrade pip
    pip install -e .

    # Install training-specific packages
    pip install -e ".[train]"
    pip install flash-attn --no-build-isolation

    # Update an existing checkout to the latest code
    git pull
    pip install -e .

Download the image-caption pairs from ChatEarthNet.
Data construction process:
| Dataset Type | File | Description |
|---|---|---|
| Pre-training | GeoLlama_Pre_train.json | Initial training dataset |
| Instruction Tuning | GeoLlama_Instruct.json | Fine-tuning dataset |
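
The schema of these JSON files is not documented here, so the following Python sketch assumes only that each file is a JSON array of records. It reports how many records a file holds and which keys the first record uses, which can serve as a quick sanity check before training. The helper name `summarize_annotations` is hypothetical.

```python
import json
from pathlib import Path


def summarize_annotations(path):
    """Return (record_count, sorted keys of the first record) for a JSON file.

    Assumption: the file holds a JSON array of objects. The real schema of
    GeoLlama_Pre_train.json / GeoLlama_Instruct.json may differ.
    """
    with Path(path).open(encoding="utf-8") as handle:
        records = json.load(handle)
    if not isinstance(records, list):
        raise ValueError(f"{path}: expected a top-level JSON array")
    first_keys = sorted(records[0]) if records and isinstance(records[0], dict) else []
    return len(records), first_keys


if __name__ == "__main__":
    for name in ("GeoLlama_Pre_train.json", "GeoLlama_Instruct.json"):
        if Path(name).exists():
            print(name, summarize_annotations(name))
        else:
            print(name, "not found in the current directory")
```

If the files turn out to use a different top-level structure, the `ValueError` makes that visible immediately.
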
Model architecture:
We utilize the pre-trained backbone Llama-3.2-11B-Vision-Instruct and train the projector from scratch.
| Training Phase | Script | Dataset |
|---|---|---|
| Pre-training | pre_train_llama32.sh | Pre-training dataset |
| Visual Instruction Tuning | fine_tune_llama32.sh | Instruction fine-tuning dataset |
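
The two phases run in order: pre-training first, then visual instruction tuning. A minimal Python sketch of a driver follows. The `scripts` directory is an assumption (the table lists only file names), the scripts are assumed to run under `bash` with no arguments, and `run_phases` is a hypothetical helper. With `dry_run=True` it only returns the commands it would execute.

```python
import subprocess
from pathlib import Path

# Script names come from the table above; their directory is an assumption.
PHASE_SCRIPTS = ("pre_train_llama32.sh", "fine_tune_llama32.sh")


def run_phases(script_dir="scripts", dry_run=True):
    """Run pre-training, then instruction tuning; return the commands used."""
    commands = [["bash", str(Path(script_dir) / name)] for name in PHASE_SCRIPTS]
    if not dry_run:
        for command in commands:
            # check=True stops the sequence if a phase exits with an error.
            subprocess.run(command, check=True)
    return commands


if __name__ == "__main__":
    for command in run_phases(dry_run=True):
        print(" ".join(command))
```

Stopping on the first failure prevents instruction tuning from starting when pre-training has failed.
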
For evaluation procedures, refer to the evaluation script:

    ./geollama/eval/test.py

Visualization Results:
We gratefully acknowledge the following projects upon which our work is built:



