CoBrALab/AllenHumanReferenceAtlas_Classified

tissue-classify

Collapse a fine-grained Allen Developing Human Brain Atlas label volume into a 9-class tissue segmentation, and document exactly how every structure was assigned to a tissue type.

The atlas labels ~3,300 named neuroanatomical structures. For tissue-level work (segmentation priors, registration targets, volumetrics) those need to be reduced to a handful of tissue classes: gray matter, deep gray matter, white matter, ventricles, CSF, cerebellar gray/white, and brain stem. This repository derives that reduction deterministically from the atlas ontology itself and applies it to the label volume.


Inputs

The ontology and label volume come from the Allen Human Reference Atlas – 3D, 2020.

The T1 image the brain mask is derived from is the MNI ICBM152 2009b nonlinear symmetric template: http://www.bic.mni.mcgill.ca/~vfonov/icbm/2009/mni_icbm152_nlin_sym_09b_nifti.zip

| File | What it is |
| --- | --- |
| voxel_count.csv | The Allen ontology: one row per structure (3,317 rows). Columns: id, graph_order, acronym, name, color_hex_triplet, parent_structure_id, structure_id_path, annotated, voxel_count, subgraph_annotated, subgraph_voxel_count, volume_mm3, volume_cm3. |
| map.csv | The 9 target tissue classes. Columns: value, shortName, description. |
| annotation_full.nii.gz | The label volume: each voxel holds an Allen structure id. uint32, 394×466×378, 0.5 mm isotropic. |
| mni_icbm152_t1_tal_nlin_sym_09b_hires_synthstrip_mask.nii.gz | SynthStrip brain mask on the same grid (1 = brain, 0 = non-brain). Used to source CSF (class 1) spatially; see below. |

The 9 tissue classes (map.csv)

| value | shortName | description |
| --- | --- | --- |
| 0 | background | unlabelled voxels |
| 1 | cerebrospinalFluid | CSF excluding the ventricles (outside brain) |
| 2 | grayMatter | gray matter tissue |
| 3 | whiteMatter | white matter tissue |
| 4 | ventricle | ventricles (CSF inside the brain) |
| 5 | cerebellarGrayMatter | gray matter tissue in the cerebellum |
| 6 | cerebellarWhiteMatter | white matter tissue in the cerebellum |
| 7 | brainStem | brain stem |
| 8 | deepGrayMatter | deep gray matter (subcortical/diencephalic nuclei) |

Ontology provenance & reference documentation

The ontology in voxel_count.csv is the Allen Brain Map StructureGraph used by the live API. The API tags these structures graph_id = 16, ontology_id = 11.

Only the appended per-volume columns (annotated, voxel_count, subgraph_voxel_count, volume_mm3, volume_cm3) are extra; the hierarchy the classifier walks (structure_id_path, acronyms) is 1:1 with the API. The whole graph can be pulled live:

curl "https://api.brain-map.org/api/v2/data/query.json?criteria=model::Structure,rma::criteria,\[graph_id\$eq16\]&num_rows=5000"

Reference documentation for the hierarchy:

| Resource | What it covers |
| --- | --- |
| README.pdf (ships beside the inputs) | Authoritative doc for this atlas: the 141-structure annotation volume, the hierarchical ontology, and the ITK-SNAP label files. |
| Atlas Drawings and Ontologies (Allen API) | How a StructureGraph works: each structure (except root) has one parent, i.e. a "part-of" edge, which is the model behind structure_id_path. Ontology downloadable as hierarchical JSON. |
| Allen Atlas Viewer | Browse the structure tree interactively. |
| Brain Map community thread | Release announcement and Q&A. |
| Human Reference Atlas ontologies paper (PMC10043028) | Peer-reviewed context on the specimen/structure/spatial ontologies. |

Mapping labels to tissue classes

voxel_count.csv and map.csv share no common column. Nothing in the atlas row for "precentral gyrus" says it is gray matter; nothing says the cerebral aqueduct is a ventricle. The mapping has to be inferred from anatomy.

The Allen ontology already encodes the tissue divisions in its own tree structure. Every major brain division splits into identically-named sub-branches:

brain
├── forebrain (F)
│   ├── gray matter of forebrain        (FGM)
│   ├── white matter of forebrain       (FWM)
│   ├── ventricles of forebrain         (FV)
│   ├── surface structures of forebrain (FSS)   ← gyri, sulci, surface nuclei
│   ├── transient structures (FTS)
│   └── blood vessels (fbv)
├── midbrain (M)
│   ├── gray matter of midbrain  (MGM)
│   ├── white matter of midbrain (MWM)
│   ├── ventricle of midbrain    (MV)
│   ├── surface structures (MSS)
│   └── blood vessels (mbv)
└── hindbrain (H)
    ├── gray matter of the hindbrain (HGM)
    │   ├── metencephalon (Met)
    │   │   ├── cerebellum (CB) ── cerebellar cortex + deep nuclei
    │   │   └── pons (Pn)
    │   └── myelencephalon / medulla (Mo)
    ├── white matter of hindbrain (HWM)   ← fiber tracts, cerebellar peduncles
    ├── ventricles of hindbrain (HV)
    ├── surface structures (HSS)
    │   └── surface structures of cerebellum (CbSS) ── lobules + fissures
    └── blood vessels (hbv)

So tissue type can be read off which sub-branch a structure lives in.


Classification method

Hierarchy is read from structure_id_path

Each row carries its full ancestry as a path, e.g.

/10153/10154/10155/10156/10157/10158/10159/10313/10327/10329/

This is the authoritative parent chain. (The parent_structure_id column is float-formatted, e.g. 10153.0, and does not key cleanly against the integer id, so it is not used.)

Anchors + "deepest anchor wins"

A small set of branch nodes are designated anchors, each tagged with a tissue value. To classify any structure, walk its structure_id_path from the structure itself upward toward the root and take the tissue value of the first anchor encountered (i.e. the deepest / most specific anchor on the path). If no anchor is on the path, the structure is background (0).

"Deepest wins" is what makes nested overrides work automatically. For example the cerebellum (CB) sits inside "gray matter of the hindbrain" (HGM). HGM is an anchor → brain stem (7), but CB is a deeper anchor → cerebellar gray (5). A cerebellar leaf hits CB before HGM, so it correctly resolves to 5, not 7.

Anchors are keyed by acronym and resolved to ids at runtime, so the logic survives id changes in future atlas releases.
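
The rule above can be sketched in a few lines of Python. This is a minimal illustration, not the code in classify_tissue.py: the anchor ids and values come from the anchor table, while the node called "root", the 9000x leaf ids, and the shortened paths are hypothetical stand-ins.

```python
# Sketch of "deepest anchor wins". Anchor ids/values follow the anchor table;
# "root", the 9000x leaves and the shortened paths are hypothetical.
ANCHORS = {"FGM": 2, "CN": 8, "HGM": 7, "CB": 5}  # acronym -> tissue value

# id -> (acronym, structure_id_path); paths are root-first and end at the node.
ROWS = {
    10153: ("root", "/10153/"),
    10157: ("FGM", "/10153/10157/"),
    10331: ("CN", "/10153/10157/10331/"),
    10654: ("HGM", "/10153/10654/"),
    10656: ("CB", "/10153/10654/10656/"),
    90001: ("leaf_cn", "/10153/10157/10331/90001/"),
    90002: ("leaf_cb", "/10153/10654/10656/90002/"),
    90003: ("leaf_hgm", "/10153/10654/90003/"),
}


def parse_path(path):
    """'/10153/10157/' -> [10153, 10157] (root first, the structure itself last)."""
    return [int(part) for part in path.strip("/").split("/") if part]


def classify(path, anchor_by_id):
    """Return the value of the first anchor met walking leaf -> root, else 0."""
    for structure_id in reversed(parse_path(path)):
        if structure_id in anchor_by_id:
            return anchor_by_id[structure_id]
    return 0  # no anchor on the path: background


# Resolve acronyms to ids at runtime, so the rule does not hard-code ids.
id_by_acronym = {acronym: sid for sid, (acronym, _) in ROWS.items()}
anchor_by_id = {id_by_acronym[a]: value for a, value in ANCHORS.items()}

result = {acronym: classify(path, anchor_by_id) for acronym, path in ROWS.values()}
print(result)
```

The leaf under CB resolves to 5 because CB is met before HGM on the way up; the leaf that sits directly under HGM resolves to 7, and the container node with no anchor on its path falls to background.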

The complete anchor table

29 anchors, exactly as defined in classify_tissue.py:

| Anchor | id | Structure name | Tissue class | Value | Rationale |
| --- | --- | --- | --- | --- | --- |
| FGM | 10157 | gray matter of forebrain | grayMatter | 2 | Default for forebrain gray = cerebral cortex. The deep gray nuclei below override to deepGrayMatter (8). |
| CN | 10331 | cerebral nuclei | deepGrayMatter | 8 | Basal ganglia (caudate, putamen, globus pallidus, nucleus accumbens), amygdala, claustrum, basal forebrain, septum, BNST. Deeper than FGM → overrides. |
| THM | 10390 | thalamus | deepGrayMatter | 8 | All thalamic nuclei + habenula, pineal, zona incerta. Deeper than FGM → overrides. |
| SubTH | 10465 | subthalamus | deepGrayMatter | 8 | Subthalamic nucleus. Deeper than FGM → overrides. |
| HTH | 10467 | hypothalamus | deepGrayMatter | 8 | Preoptic/supraoptic/tuberal/mammillary regions. Deeper than FGM → overrides. |
| FWM | 10557 | white matter of forebrain | whiteMatter | 3 | Fiber tracts: corpus callosum, internal capsule, fornix, etc. |
| FV | 10595 | ventricles of forebrain | ventricle | 4 | Lateral + third ventricles and their horns. |
| FSS | 10609 | surface structures of forebrain | grayMatter | 2 | Default for the surface branch, dominated by cortical gyri/lobules (the macroscopic gray-matter parcellation) plus surface gray nuclei (mammillary body, etc.). Overridden below for non-gray members. |
| CeS | 10610 | cerebral sulci | cerebrospinalFluid | 1 | Sulci & fissures (central sulcus, Sylvian/longitudinal/calcarine fissures…) are the infolded subarachnoid CSF spaces. Deeper than FSS, so it overrides. |
| ASFV | 146034908 | adjoining structures of forebrain ventricles | cerebrospinalFluid | 1 | Peri-ventricular CSF-adjacent structures. Overrides FSS. |
| fbv | 266441685 | blood vessels of forebrain | background | 0 | Vasculature is not one of the 9 tissue classes. |
| FTS | 10506 | transient structures of forebrain | background | 0 | Transient developmental structures (germinal zones, subplate) have no tissue class. Reviewable default. |
| MGM | 10649 | gray matter of midbrain | brainStem | 7 | Midbrain is part of the brain stem; map.csv has no separate midbrain GM/WM, so the whole midbrain collapses to brainStem. |
| MWM | 10650 | white matter of midbrain | brainStem | 7 | Same: midbrain fiber tracts are brain stem. |
| MV | 10651 | ventricle of midbrain | ventricle | 4 | Cerebral aqueduct. |
| MSS | 10652 | surface structures of midbrain | brainStem | 7 | Tectal surface, colliculi, nerve roots: brain stem surface. |
| mbv | 266441705 | blood vessels of midbrain | background | 0 | Vasculature. |
| HGM | 10654 | gray matter of the hindbrain | brainStem | 7 | Default for hindbrain gray = pons + medulla → brain stem. Cerebellum (deeper) overrides. |
| HWM | 10668 | white matter of hindbrain | cerebellarWhiteMatter | 6 | Catch-all hindbrain WM whose bulk voxels are the cerebellar internal white matter (arbor vitae): CB is filed under the gray branch (HGM), so its internal WM has no node and lands here. Spatial check: 100% of HWM voxels lie inside the cerebellum bbox, 85.5% closer to the cerebellum than the brainstem. Its peduncle children (icp, mcp) inherit 6. |
| HV | 10669 | ventricles of hindbrain | ventricle | 4 | Fourth ventricle, central canal. |
| HSS | 10670 | surface structures of hindbrain | brainStem | 7 | Default for hindbrain surface = pons/medulla surface. Cerebellar surface (deeper) overrides. |
| hbv | 266441737 | blood vessels of hindbrain | background | 0 | Vasculature. |
| HTS | 10663 | transient structures of hindbrain | background | 0 | Transient developmental structures. Reviewable default. |
| CB | 10656 | cerebellum | cerebellarGrayMatter | 5 | Cerebellar cortex + deep nuclei (both gray). Deeper than HGM → overrides brainStem. |
| CbSS | 12827 | surface structures of cerebellum | cerebellarGrayMatter | 5 | Cerebellar lobes/lobules (gray). Deeper than HSS → overrides brainStem. |
| cbf | 12828 | cerebellar fissures | cerebrospinalFluid | 1 | Cerebellar fissures = CSF. Deeper than CbSS → overrides cerebellar gray. |
| scp | 12354 | superior cerebellar peduncle | cerebellarWhiteMatter | 6 | Cerebellar WM tract filed under MWM (midbrain). Deeper than MWM → overrides brainStem. |
| xscp | 12337 | decussation of superior cerebellar peduncle | cerebellarWhiteMatter | 6 | Same; 0 voxels in this volume but anatomically cerebellar WM. (icp/mcp under HWM inherit 6, no anchor needed.) |
| SpC | 12890 | spinal cord | background | 0 | Not a brain tissue class. Reviewable default. |

Default (no anchor on path): background (0). This catches the abstract container nodes above the GM/WM/V split — neural plate, neural tube, brain, forebrain, midbrain, hindbrain — which never appear as voxel labels.

Worked resolution examples

| Structure (acronym, path tail) | Anchors hit (leaf → root) | Result |
| --- | --- | --- |
| precentral gyrus (PrCG) | FSS | gray (2) |
| putamen (Pu) | CN → FGM | deep gray (8) |
| thalamus (Pul) | THM → FGM | deep gray (8) |
| head of hippocampus (HiH) | FSS | gray (2) |
| longitudinal fissure (los) | CeS → FSS | CSF (1) |
| corpus callosum (cc) | FWM | white (3) |
| third ventricle (3V) | FV | ventricle (4) |
| cerebellar lobule (under CBLL) | CbSS → HSS | cerebellar gray (5) |
| cerebellar fissure (under cbf) | cbf → CbSS → HSS | CSF (1) |
| pons tegmentum (PnTg) | HGM (via Met/Pn) | brain stem (7) |
| red nucleus (RN, midbrain) | MGM/MWM | brain stem (7) |
| spinal cord leaf | SpC | background (0) |

Key design decisions

  • Midbrain + pons + medulla → a single brainStem (7). map.csv defines no separate brain-stem gray/white, and lists cerebellar gray/white as the only posterior-fossa subdivisions. So the entire midbrain and the non-cerebellar hindbrain collapse to brain stem, regardless of their internal gray/white organization. (Their ventricles still go to ventricle (4), since those branches are siblings of the GM/WM nodes, never nested under them.)

  • Gyri are gray matter; sulci/fissures are CSF. The FSS "surface structures" branch is a macroscopic landmark parcellation containing both the cortical gyri (CeG, gray) and the sulci/fissures (CeS, the CSF-filled subarachnoid clefts). They are split accordingly. This is the one place where a naive "whole branch → one tissue" rule fails, and the deepest-wins override (CeS → CSF inside FSS → gray) handles it.

  • Deep gray nuclei → deepGrayMatter (8). The subcortical/diencephalic nuclei (basal ganglia, amygdala, thalamus, hypothalamus, subthalamic nucleus) are siblings of cortex under FGM. The four branch anchors CN/THM/SubTH/HTH (deeper than FGM) pull them into class 8, leaving cortex in class 2. Scope is deliberately forebrain only: the midbrain deep nuclei (substantia nigra SN, red nucleus RN) stay brainStem (7) to keep the "midbrain = brain stem" rule intact, and the cerebellar deep nuclei (CbDN) stay cerebellarGrayMatter (5). The hippocampus stays grayMatter (2): the atlas files it under the surface branch FSS (allocortex), not under CN, so the deep-gray anchors never reach it.


CSF (class 1) is sourced from the brain mask, not the ontology

The Allen annotation carries no CSF voxels: 106 sulci/fissure structures map to class 1 in the ontology, but none of them are segmented in annotation_full.nii.gz. Subarachnoid/sulcal CSF is therefore defined spatially instead, in apply_classification.py, using the SynthStrip brain mask:

every voxel that is inside the brain mask (mask == 1) but left unlabelled by the annotation (classify == 0) becomes CSF (1).

The mask was generated with SynthStrip:

synthstrip -i mni_icbm152_t1_tal_nlin_sym_09b_hires.nii.gz \
           -m mni_icbm152_t1_tal_nlin_sym_09b_hires_synthstrip_mask.nii.gz

These are exactly the CSF-filled clefts the parenchymal atlas does not cover. The fill adds 2,811,582 CSF voxels (only 1,060 annotation voxels fall outside the mask). After this step every class 1–8 is populated in the volume.
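
The fill rule is a single boolean mask operation. The sketch below applies it to toy 1-D arrays standing in for the 3-D volumes; the array contents are made up, and only the rule itself (inside the mask and still unlabelled → class 1) is taken from the text.

```python
import numpy as np

# Toy stand-ins for the real volumes (values are illustrative only).
classified = np.array([0, 0, 2, 3, 0, 4, 7], dtype=np.uint8)  # 0 = unlabelled
brain_mask = np.array([0, 1, 1, 1, 1, 1, 0], dtype=np.uint8)  # 1 = brain

filled = classified.copy()
csf_fill = (brain_mask == 1) & (filled == 0)  # in the brain, not yet labelled
filled[csf_fill] = 1                          # becomes CSF (class 1)

n_added = int(csf_fill.sum())
# Labelled voxels that fall outside the mask are left untouched, only counted.
n_outside = int(((classified != 0) & (brain_mask == 0)).sum())

print(filled.tolist(), n_added, n_outside)
```

Voxels outside the mask keep whatever label the annotation gave them, which is why the text can report a small count of annotation voxels lying outside the mask without those voxels being altered.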

6 cerebellarWhiteMatter — cerebellar internal WM + peduncles

The atlas has no cerebellar-white-matter node under the cerebellum (CB divides only into cortex CBC and deep nuclei CbDN, both gray; there is no arbor-vitae / internal-WM node). The cerebellar white matter is instead recovered from two places the ontology files outside CB:

  1. HWM (white matter of hindbrain) — the bulk. This catch-all's voxels are the cerebellar internal white matter (arbor vitae), because CB is filed under the gray branch (HGM) and its internal WM has no dedicated node. A spatial check on annotation_full.nii.gz confirms it: 100% of HWM voxels lie inside the cerebellum bounding box, 85.5% are closer to the cerebellum centroid than to the brainstem, and the HWM centroid (z,y,x)=(69,152,196) nearly coincides with the cerebellum-gray centroid (71,142,196). So HWM → 6.

  2. The three cerebellar peduncles, filed under midbrain/hindbrain WM:

    | peduncle | acronym | id | voxels (csv) | filed under |
    | --- | --- | --- | --- | --- |
    | superior cerebellar peduncle | scp | 12354 | 2,338 | MWM (midbrain WM) |
    | inferior cerebellar peduncle | icp | 12741 | 1,592 | HWM (hindbrain WM) |
    | middle cerebellar peduncle | mcp | 12768 | 10,905 | HWM (hindbrain WM) |
    | decussation of scp | xscp | 12337 | 0 | MWM |

    scp/xscp (under MWM) are explicit anchors → 6 (deeper than MWM, so deepest-wins overrides brainStem). icp/mcp simply inherit HWM's 6.

Together these give 143,819 csv voxels / 287,566 in the volume for class 6 (cerebellar gray:white ≈ 3.9:1, anatomically reasonable).

Residual trade-off. HWM is one atomic label in the volume, so the ~14.5% of its voxels near the pons (transitional pontocerebellar / MCP-entry fibers) also go to class 6. This is acceptable — they are cerebellar-bound fibers — and far better than the alternative, which would mislabel the 85.5% cerebellar core as brainstem. The remaining class 7 (midbrain + pons + medulla gray + their named tracts, 139,773 csv voxels) is the true brainstem.
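
For readers who want to repeat the centroid-distance part of the spatial check, here is one way it could be computed. The README does not say how the distances were measured, so Euclidean distance in voxel-index space is an assumption, and the helper names centroid and fraction_closer are hypothetical; the toy volume is made up.

```python
import numpy as np


def centroid(mask):
    """Mean voxel index (z, y, x) of the True voxels in a boolean mask."""
    return np.argwhere(mask).mean(axis=0)


def fraction_closer(target, ref_a, ref_b):
    """Fraction of target voxels nearer to ref_a's centroid than to ref_b's."""
    points = np.argwhere(target)
    dist_a = np.linalg.norm(points - centroid(ref_a), axis=1)
    dist_b = np.linalg.norm(points - centroid(ref_b), axis=1)
    return float((dist_a < dist_b).mean())


# Toy 1x1x10 volume: "cerebellum" at x=0..2, "brainstem" at x=7..9,
# and a tract occupying x=3..6 in between.
shape = (1, 1, 10)
cerebellum = np.zeros(shape, dtype=bool)
cerebellum[0, 0, 0:3] = True
brainstem = np.zeros(shape, dtype=bool)
brainstem[0, 0, 7:10] = True
tract = np.zeros(shape, dtype=bool)
tract[0, 0, 3:7] = True

frac = fraction_closer(tract, cerebellum, brainstem)
print(frac)
```

On the real data the three masks would be built from annotation_full.nii.gz, for example with np.isin over the relevant structure ids.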


Class distribution

| value | tissue | structures (all 3,317) | structures (141 voxel-bearing) |
| --- | --- | --- | --- |
| 0 | background | 788 | 0 |
| 1 | cerebrospinalFluid | 106 | 0 |
| 2 | grayMatter | 1,198 | 57 |
| 3 | whiteMatter | 140 | 8 |
| 4 | ventricle | 23 | 9 |
| 5 | cerebellarGrayMatter | 86 | 4 |
| 6 | cerebellarWhiteMatter | 91 | 4 |
| 7 | brainStem | 530 | 12 |
| 8 | deepGrayMatter | 355 | 47 |

Only 141 of the 3,317 ontology nodes are annotated=True (carry voxels); the rest are abstract intermediate nodes. Every one of the 141 voxel-bearing structures resolves to a real tissue class (1–8) — none fall to background, which is the primary correctness check. (Class 1 / CSF has 0 voxel-bearing structures because its voxels come from the brain-mask fill, not the ontology.)


Pipeline & usage

One command builds everything from scratch — download → brain mask → relabel → priors:

./generate_all.sh

It downloads the external inputs, runs SynthStrip for the brain mask (skipped if the mask is already present; override with SYNTHSTRIP=<cmd>), then runs all four stages below. The Python deps (SimpleITK + numpy, declared in requirements.txt) are pulled in per stage by uv run --with-requirements, so no virtual environment is managed by hand. Inputs and mask are reused if present; the pipeline stages always re-run.

Requirements: curl, unzip, gzip, uv, and SynthStrip (only if the mask is not already present).

To run the stages by hand instead:

# Stage 0 — download the external inputs (Allen ontology + volume, MNI T1)
./generate_all.sh   # (also runs stages 1–4; see above)

# Stage 1 — build the lookup tables (stdlib only)
python3 classify_tissue.py

# Stage 2 — relabel the volume + spatial CSF fill (SimpleITK)
uv run --with-requirements requirements.txt python apply_classification.py

# Stage 3 — probabilistic tissue priors for ANTs Atropos (SimpleITK)
uv run --with-requirements requirements.txt python make_priors.py

# Stage 4 — coarser schemes (4-class, 3-class, cbmerge): volumes + priors
uv run --with-requirements requirements.txt python make_coarse.py

classify_tissue.py

  1. Reads voxel_count.csv with csv.DictReader (the name column contains commas — a real CSV parser is required).
  2. Resolves the 29 anchor acronyms to ids and builds the anchor→value table.
  3. Classifies every structure by the deepest-anchor-on-path rule.
  4. Writes the LUTs, numerically sorted by id: tissue_map.lut (all nodes), tissue_map_annotated.lut (voxel-bearing nodes), and tissue_map_background.lut (a visualization aid mapping background-class structures → 1, else 0).
  5. Prints an audit: per-class counts, and flags any voxel-bearing structure that fell to background (would indicate a misclassification to review).
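
Steps 1, 2 and 4 can be illustrated on a two-row sample. This is a sketch under stated assumptions rather than the script itself: the sample rows, the "XX" structure, the "NOPE" anchor and the toy classification values are all made up, and only a subset of the real columns is shown.

```python
import csv
import io
import sys

# Two made-up rows in the column layout of voxel_count.csv (subset of columns).
# The second name contains a comma, so a naive line.split(",") would break it.
SAMPLE = (
    "id,acronym,name,structure_id_path,annotated\n"
    "10157,FGM,gray matter of forebrain,/10153/10157/,False\n"
    '90001,XX,"hypothetical nucleus, lateral part",/10153/10157/90001/,True\n'
)

rows = list(csv.DictReader(io.StringIO(SAMPLE)))
id_by_acronym = {row["acronym"]: int(row["id"]) for row in rows}

# Resolve anchor acronyms to ids; "NOPE" is deliberately unresolvable.
ANCHORS = {"FGM": 2, "NOPE": 7}
anchor_by_id, unresolved = {}, []
for acronym, value in ANCHORS.items():
    if acronym in id_by_acronym:
        anchor_by_id[id_by_acronym[acronym]] = value
    else:
        unresolved.append(acronym)
        print(f"warning: anchor acronym not in ontology: {acronym}", file=sys.stderr)

# LUT lines are "id value", sorted numerically by id (ints, not strings).
values = {90001: 2, 10157: 2}  # toy classification result
lut_lines = [f"{sid} {val}" for sid, val in sorted(values.items())]

print(rows[1]["name"], anchor_by_id, unresolved, lut_lines)
```

Sorting on integer keys matters once ids of different lengths are mixed (for example 12890 and 146034908), because a string sort would order them differently.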

apply_classification.py

  1. Loads tissue_map.lut (the all-nodes superset) as the single source of truth — the relabeling logic is not duplicated here.
  2. Reads annotation_full.nii.gz into a numpy array.
  3. Sparse vectorized remap. Ids reach ~2.7 × 10⁸, so a dense lookup array is avoided. Instead: u = np.unique(arr), build the small vals array for just those ids, then out = vals[np.searchsorted(u, arr)]. Voxels with id 0 or any id absent from the LUT → background.
  4. Spatial CSF fill. Reads the SynthStrip brain mask (grid asserted to match) and sets out[(mask == 1) & (out == 0)] = 1, sourcing the CSF class the annotation does not carry (see "CSF (class 1)" above).
  5. Writes classify.nii.gz as uint8, calling CopyInformation to preserve spacing, origin and direction.
  6. Runs the verification below.
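
The sparse remap in step 3 is worth seeing on a small array. The sketch below uses the three example LUT rows plus the fbv anchor id; the id 99999 is hypothetical and stands for any id missing from the LUT.

```python
import numpy as np

# id -> tissue value. 10329, 12369, 12384 are the example LUT rows;
# 266441685 is fbv (background). 99999 below is a hypothetical unknown id.
lut = {10329: 2, 12369: 4, 12384: 5, 266441685: 0}

arr = np.array([[0, 10329, 12369],
                [12384, 266441685, 99999]], dtype=np.uint32)

u = np.unique(arr)  # sorted distinct ids actually present in the volume
# One value per distinct id; ids missing from the LUT (and id 0) -> background.
vals = np.array([lut.get(int(i), 0) for i in u], dtype=np.uint8)
# Every element of arr is in u, so searchsorted returns its exact index in u.
out = vals[np.searchsorted(u, arr)]

print(out.tolist())
```

The lookup table has one entry per distinct id in the volume rather than one per possible id value, which is what keeps memory small when ids run into the hundreds of millions.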

make_priors.py

Converts the hard classify.nii.gz into soft spatial priors:

  1. For each tissue class 1–8, builds a binary mask and blurs it with a 2 mm FWHM Gaussian (sigma = FWHM / (2·√(2·ln2)) = 0.8493 mm). Uses sitk.DiscreteGaussian (a non-negative separable kernel, spacing-aware).
  2. Normalizes across classes per voxel: prior_c = smoothed_c / Σ smoothed wherever the total support exceeds 1e-4; elsewhere all priors are 0. The result is a proper per-voxel probability distribution over the 8 classes.
  3. Writes prior01.nii.gz … prior08.nii.gz (float32, 0–1), priorNN ↔ class NN, preserving geometry via CopyInformation.
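
The FWHM-to-sigma conversion and the per-voxel normalization can be checked without SimpleITK. In this sketch the smoothing itself is skipped: the smoothed array is a made-up stand-in for already-blurred class masks. SimpleITK's DiscreteGaussian is parameterized by variance, so the value handed to it would presumably be sigma squared; how make_priors.py passes it is not shown in this README.

```python
import math

import numpy as np

FWHM_MM = 2.0
sigma_mm = FWHM_MM / (2.0 * math.sqrt(2.0 * math.log(2.0)))
variance_mm2 = sigma_mm ** 2  # assumption: what a variance-based filter would take

# Made-up smoothed maps, shape (n_classes, n_voxels): two classes, three voxels.
smoothed = np.array([[0.6, 0.0, 0.00002],
                     [0.2, 0.0, 0.00003]], dtype=np.float32)

total = smoothed.sum(axis=0)
support = total > 1e-4  # voxels with enough total support to normalize

priors = np.zeros_like(smoothed)
priors[:, support] = smoothed[:, support] / total[support]

print(round(sigma_mm, 4), priors[:, 0].tolist())
```

Voxels below the support threshold stay at zero for every class, so the priors sum to 1 inside the support and to 0 outside it.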

Worked example (one axial slice)

The figures below are all the same axial slice, rendered by make_figures.py (run it after a build: uv run --with-requirements requirements.txt --with matplotlib python make_figures.py).

Stage 0 — inputs: the MNI T1 (the SynthStrip mask is derived from this) and the Allen annotation volume (each colour = a distinct structure id):

[Figure: Stage 0 inputs]

Stage 1 builds the text LUTs (tissue_map*.lut); stage 2 applies them to the annotation and adds the mask-sourced CSF, giving the 9-class segmentation:

[Figure: Stage 2 classification]

Stage 3 — the eight probabilistic priors (smoothed, normalized to 0–1). At this slice prior06 (cerebellar WM) is nearly empty, as expected:

[Figure: Stage 3 priors]

Stage 4 — the coarser schemes. cbmerge keeps brain stem (brown), cerebellum (purple) and deep gray (pink) distinct; 4-class/3-class fold further:

[Figure: Stage 4 coarse schemes]


Outputs

| File | Format | Contents |
| --- | --- | --- |
| tissue_map.lut | id value, space-separated, 3,317 rows | All ontology nodes → tissue value. |
| tissue_map_annotated.lut | id value, space-separated, 141 rows | Only the voxel-bearing structures. |
| classify.nii.gz | NIfTI, uint8, values 0–8 | The tissue segmentation (same grid as the input volume), incl. mask-sourced CSF. |
| prior01.nii.gz … prior08.nii.gz | NIfTI, float32, 0–1 | Per-class probabilistic priors (priorNN ↔ class NN), Atropos-ready. |

classify_tissue.py also writes tissue_map_background.lut (a visualization aid: background-class structures → 1, else 0). The coarser schemes add classify_<tag>.nii.gz and prior_<tag>_0N.nii.gz — see Coarser schemes.

Example LUT rows:

10329 2
12369 4
12384 5

Verification

Four independent checks, all passing:

  1. No voxel-bearing structure falls to background. All 141 annotated=True structures resolve to a tissue class 1–8.

  2. The output volume contains only valid tissue values. np.unique(classify.nii.gz) == [0, 1, 2, 3, 4, 5, 6, 7, 8] — every class 1–8 is populated (class 1 via the brain-mask fill, the rest from the annotation).

  3. Per-structure voxel-count cross-check against voxel_count.csv. Aggregating the csv voxel_count column by tissue value and comparing to the actual voxel counts in classify.nii.gz:

    class                    volume voxels   csv voxels   ratio
    2 grayMatter                 7144564      3572282   2.000
    3 whiteMatter                4393397      2207402   1.990
    4 ventricle                   233697       120733   1.936
    5 cerebellarGrayMatter       1115998       557999   2.000
    6 cerebellarWhiteMatter        287566       143819   1.999
    7 brainStem                   275416       139773   1.970
    8 deepGrayMatter              512772       257193   1.994
    

    Every leaf structure matches at exactly 2.000× — the volume is at twice the count-scale of voxel_count.csv (the csv counts are at half scale). The ratio being a single shared constant across all classes is the real proof of mapping consistency: a relabeling bug would skew classes by different amounts. The small dips below 2.0 for thin structures (ventricle 1.936) are half-scale partial-volume effects on the csv counts, not mapping errors.

  4. Priors are a valid probability distribution and recover the hard labels. Each prior0N.nii.gz lies in [0, 1]; the per-voxel sum is 1.000000 inside the support and 0 outside. Taking argmax over the 8 priors reproduces the hard classify.nii.gz label at 97.2% of supported labelled voxels and 99.98% of strong-prior interiors (>0.9) — the ~2.8% gap is expected boundary softening from the 2 mm-FWHM Gaussian, not a labeling error.
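
The ratios in check 3 can be recomputed directly from the two count columns of the table. The numbers below are copied from that table; nothing else is assumed.

```python
# class name -> (voxels in classify.nii.gz, voxels summed from voxel_count.csv)
COUNTS = {
    "grayMatter": (7144564, 3572282),
    "whiteMatter": (4393397, 2207402),
    "ventricle": (233697, 120733),
    "cerebellarGrayMatter": (1115998, 557999),
    "cerebellarWhiteMatter": (287566, 143819),
    "brainStem": (275416, 139773),
    "deepGrayMatter": (512772, 257193),
}

ratios = {name: vol / csv_count for name, (vol, csv_count) in COUNTS.items()}
for name, ratio in ratios.items():
    print(f"{name:24s}{ratio:6.3f}")

# A mapping bug would show up as one class drifting far from the others.
spread = max(ratios.values()) - min(ratios.values())
print(f"spread: {spread:.3f}")
```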


Coarser schemes: 4-class, 3-class, cbmerge

make_coarse.py derives reduced label sets by collapsing the 8 tissue classes of classify.nii.gz (no second ontology walk — the 9-class result is the source of truth). It regenerates, for each scheme, the LUTs, the relabeled volume, and the priors, reusing the identical 2 mm-FWHM DiscreteGaussian smoothing + per-voxel normalization as make_priors.py.

How each 9-class label folds in (brain stem → white matter by project decision; ventricles → CSF; cerebellar GM/WM merge into GM/WM):

| 9-class source | 4-class (CSF/GM/WM/DeepGM) | 3-class (CSF/GM/WM) |
| --- | --- | --- |
| 1 cerebrospinalFluid | 1 CSF | 1 CSF |
| 4 ventricle | 1 CSF | 1 CSF |
| 2 grayMatter | 2 GM | 2 GM |
| 5 cerebellarGrayMatter | 2 GM | 2 GM |
| 8 deepGrayMatter | 4 DeepGM | 2 GM |
| 3 whiteMatter | 3 WM | 3 WM |
| 6 cerebellarWhiteMatter | 3 WM | 3 WM |
| 7 brainStem | 3 WM | 3 WM |

A third scheme, cbmerge (7 classes), keeps the full 9-class granularity but merges cerebellar gray (5) + cerebellar white (6) into a single cerebellum label, renumbered contiguous: 1 CSF, 2 GM, 3 WM, 4 ventricle, 5 cerebellum, 6 brainStem, 7 deepGM. (All other classes unchanged; nothing else folds.)
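
The three folds are plain lookup tables over the 9-class labels. The sketch below writes them out as dicts; the table and function names are hypothetical, and mapping background 0 → 0 is an assumption, since the text lists only classes 1–8.

```python
# 9-class label -> coarse label. 0 -> 0 (background) is assumed, not stated.
TO_4CLASS = {0: 0, 1: 1, 4: 1, 2: 2, 5: 2, 8: 4, 3: 3, 6: 3, 7: 3}
TO_3CLASS = {0: 0, 1: 1, 4: 1, 2: 2, 5: 2, 8: 2, 3: 3, 6: 3, 7: 3}
# cbmerge: 1 CSF, 2 GM, 3 WM, 4 ventricle, 5 cerebellum, 6 brainStem, 7 deepGM
TO_CBMERGE = {0: 0, 1: 1, 2: 2, 3: 3, 4: 4, 5: 5, 6: 5, 7: 6, 8: 7}


def fold(labels, table):
    """Map each 9-class label through a coarse-scheme table."""
    return [table[label] for label in labels]


nine_class = list(range(9))  # one voxel of every class, 0..8
print(fold(nine_class, TO_4CLASS))
print(fold(nine_class, TO_3CLASS))
print(fold(nine_class, TO_CBMERGE))
```

On a full volume the same tables would typically be applied as an indexed numpy lookup array rather than a Python loop.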

All schemes verify the same way as the 9-class priors: labels in range, per-voxel prior sum 1.000000 in support / 0 outside → CONSISTENT. (For the 4-/3-class schemes, CSF and WM voxel coverage is identical; only GM vs DeepGM differs, via per-scheme renormalization.)

Outputs per scheme <tag> ∈ {4class, 3class, cbmerge}: map_<tag>.csv, tissue_map_<tag>.lut, tissue_map_<tag>_annotated.lut, classify_<tag>.nii.gz, and prior_<tag>_0N.nii.gz (Atropos-ready as -p prior_<tag>_%02d.nii.gz).


Reproducibility & customization

  • The entire mapping is defined by the ANCHORS dict at the top of classify_tissue.py. To change a decision, edit one line and rerun the pipeline stages so that the LUTs, the relabeled volume, and the priors are all regenerated.
  • Reviewable defaults, called out explicitly so they are easy to revisit:
    • transient developmental structures (FTS, HTS) → background (0)
    • spinal cord (SpC) → background (0)
    • blood vessels (fbv, mbv, hbv) → background (0)
    • sulci/fissures (CeS, cbf) → CSF (1)
    • bare container nodes (neural plate/tube, brain, forebrain/mid/hindbrain) → background (0)
  • Because anchors are keyed by acronym and resolved at runtime (with a warning on any unresolved acronym), the pipeline is robust to id renumbering across atlas versions.

File manifest

voxel_count.csv             input  — Allen ontology (3,317 structures)
map.csv                     input  — 9 tissue classes
annotation_full.nii.gz      input  — label volume (Allen ids per voxel)
mni_icbm152_..._synthstrip_mask.nii.gz  input — brain mask (CSF source)
generate_all.sh             full build: download inputs + mask + run pipeline
requirements.txt            python deps (SimpleITK, numpy) for `uv run`
classify_tissue.py          builds the LUTs from the ontology hierarchy
apply_classification.py     relabels the volume + mask CSF fill via SimpleITK
make_priors.py              builds the 8 probabilistic priors via SimpleITK
make_coarse.py              derives the 4-class/3-class/cbmerge schemes (LUT/vol/priors)
make_figures.py             renders the images/ example-slice figures (matplotlib)
images/                     example-slice PNGs embedded in this README
tissue_map.lut              output — all 3,317 nodes  → tissue value
tissue_map_annotated.lut    output — 141 voxel-bearing → tissue value
tissue_map_background.lut   output — viz aid: background-class structures → 1
classify.nii.gz             output — 9-class tissue segmentation (uint8)
prior01..08.nii.gz          output — per-class probabilistic priors (float32)
map_<tag>.csv               output — coarse scheme defs (tag: 4class/3class/cbmerge)
tissue_map_<tag>[_annotated].lut  output — coarse id → tissue LUTs
classify_<tag>.nii.gz       output — coarse tissue segmentations (uint8)
prior_<tag>_0N.nii.gz       output — coarse per-class priors (float32)

About

Conversion of the AHRA to Tissue Classifications
