tissue-classify

Collapse a fine-grained Allen Developing Human Brain Atlas label volume into a 9-class tissue segmentation, and document exactly how every structure was assigned to a tissue type.

The atlas labels ~3,300 named neuroanatomical structures. For tissue-level work (segmentation priors, registration targets, volumetrics) those need to be reduced to a handful of tissue classes: gray matter, deep gray matter, white matter, ventricles, CSF, cerebellar gray/white, and brain stem. This repository derives that reduction deterministically from the atlas ontology itself and applies it to the label volume.

Inputs

The ontology and label volume come from the Allen Human Reference Atlas – 3D, 2020:

Announcement / overview: https://community.brain-map.org/t/allen-human-reference-atlas-3d-2020-new/405
Data download (version 1): https://download.alleninstitute.org/informatics-archive/allen_human_reference_atlas_3d_2020/version_1/

The T1 image the brain mask is derived from is the MNI ICBM152 2009b nonlinear symmetric template: http://www.bic.mni.mcgill.ca/~vfonov/icbm/2009/mni_icbm152_nlin_sym_09b_nifti.zip

File	What it is
`voxel_count.csv`	The Allen ontology: one row per structure (3,317 rows). Columns: `id, graph_order, acronym, name, color_hex_triplet, parent_structure_id, structure_id_path, annotated, voxel_count, subgraph_annotated, subgraph_voxel_count, volume_mm3, volume_cm3`.
`map.csv`	The 9 target tissue classes. Columns: `value, shortName, description`.
`annotation_full.nii.gz`	The label volume — each voxel holds an Allen structure `id`. `uint32`, 394×466×378, 0.5 mm isotropic.
`mni_icbm152_t1_tal_nlin_sym_09b_hires_synthstrip_mask.nii.gz`	SynthStrip brain mask on the same grid (1 = brain, 0 = non-brain). Used to source CSF (class 1) spatially — see below.

The 9 tissue classes (`map.csv`)

value	shortName	description
0	`background`	unlabelled voxels
1	`cerebrospinalFluid`	CSF excluding the ventricles (outside brain)
2	`grayMatter`	gray matter tissue
3	`whiteMatter`	white matter tissue
4	`ventricle`	ventricles (CSF inside the brain)
5	`cerebellarGrayMatter`	gray matter tissue in the cerebellum
6	`cerebellarWhiteMatter`	white matter tissue in the cerebellum
7	`brainStem`	brain stem
8	`deepGrayMatter`	deep gray matter (subcortical/diencephalic nuclei)

Ontology provenance & reference documentation

The ontology in voxel_count.csv is the Allen Brain Map StructureGraph used by the live API. The API tags these structures graph_id = 16, ontology_id = 11.

Only the appended per-volume columns (annotated, voxel_count, subgraph_annotated, subgraph_voxel_count, volume_mm3, volume_cm3) are extra; the hierarchy the classifier walks (structure_id_path, acronyms) is 1:1 with the API. The whole graph can be pulled live:

curl "https://api.brain-map.org/api/v2/data/query.json?criteria=model::Structure,rma::criteria,\[graph_id\$eq16\]&num_rows=5000"

Reference documentation for the hierarchy:

Resource	What it covers
`README.pdf` (ships beside the inputs)	Authoritative doc for this atlas: the 141-structure annotation volume, the hierarchical ontology, and the ITK-SNAP label files.
Atlas Drawings and Ontologies — Allen API	How a StructureGraph works: each structure (except `root`) has one parent = a "part-of" edge — the model behind `structure_id_path`. Ontology downloadable as hierarchical JSON.
Allen Atlas Viewer	Browse the structure tree interactively.
Brain Map community thread	Release announcement and Q&A.
Human Reference Atlas ontologies paper (PMC10043028)	Peer-reviewed context on the specimen/structure/spatial ontologies.

Mapping labels to tissue classes

voxel_count.csv and map.csv share no common column. Nothing in the atlas row for "precentral gyrus" says it is gray matter; nothing says the cerebral aqueduct is a ventricle. The mapping has to be inferred from anatomy.

The Allen ontology already encodes the tissue divisions in its own tree structure. Every major brain division splits into similarly named sub-branches:

brain
├── forebrain (F)
│   ├── gray matter of forebrain        (FGM)
│   ├── white matter of forebrain       (FWM)
│   ├── ventricles of forebrain         (FV)
│   ├── surface structures of forebrain (FSS)   ← gyri, sulci, surface nuclei
│   ├── transient structures (FTS)
│   └── blood vessels (fbv)
├── midbrain (M)
│   ├── gray matter of midbrain  (MGM)
│   ├── white matter of midbrain (MWM)
│   ├── ventricle of midbrain    (MV)
│   ├── surface structures (MSS)
│   └── blood vessels (mbv)
└── hindbrain (H)
    ├── gray matter of the hindbrain (HGM)
    │   ├── metencephalon (Met)
    │   │   ├── cerebellum (CB) ── cerebellar cortex + deep nuclei
    │   │   └── pons (Pn)
    │   └── myelencephalon / medulla (Mo)
    ├── white matter of hindbrain (HWM)   ← fiber tracts, cerebellar peduncles
    ├── ventricles of hindbrain (HV)
    ├── surface structures (HSS)
    │   └── surface structures of cerebellum (CbSS) ── lobules + fissures
    └── blood vessels (hbv)

So the tissue type of a structure can be read off from the sub-branch it lives in.

Classification method

Hierarchy is read from `structure_id_path`

Each row carries its full ancestry as a path, e.g.

/10153/10154/10155/10156/10157/10158/10159/10313/10327/10329/

This is the authoritative parent chain. (The parent_structure_id column is float-formatted, e.g. 10153.0, and does not key cleanly against the integer id, so it is not used.)
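As a minimal sketch of how such a path can be turned into a parent chain, the snippet below splits it into integer ids ordered root to leaf. The helper name `parse_id_path` is illustrative and is not taken from classify_tissue.py.

```python
def parse_id_path(path: str) -> list[int]:
    """Split '/a/b/c/' into a root-to-leaf list of integer structure ids."""
    return [int(part) for part in path.strip("/").split("/") if part]


path = "/10153/10154/10155/10156/10157/10158/10159/10313/10327/10329/"
ids = parse_id_path(path)

print(ids[-1])  # the structure's own id (last element of the path)
print(ids[-2])  # its direct parent, recovered without parent_structure_id
```

Because the ids are parsed as integers, they key directly against the integer `id` column, which is the problem the float-formatted parent column has.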

Anchors + "deepest anchor wins"

A small set of branch nodes are designated anchors, each tagged with a tissue value. To classify any structure, walk its structure_id_path from the structure itself upward toward the root and take the tissue value of the first anchor encountered (i.e. the deepest / most specific anchor on the path). If no anchor is on the path, the structure is background (0).

"Deepest wins" is what makes nested overrides work automatically. For example the cerebellum (CB) sits inside "gray matter of the hindbrain" (HGM). HGM is an anchor → brain stem (7), but CB is a deeper anchor → cerebellar gray (5). A cerebellar leaf hits CB before HGM, so it correctly resolves to 5, not 7.

Anchors are keyed by acronym and resolved to ids at runtime, so the logic survives id changes in future atlas releases.
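The rule can be sketched in a few lines. This is an illustration under stated assumptions, not the code in classify_tissue.py: `ANCHOR_VALUE` holds only four of the anchors (already resolved to ids), the function name `classify` is hypothetical, the leaf ids 900001 and 900002 are made up, and the paths are shortened to the anchors that matter.

```python
# Four anchors from the anchor table, keyed by resolved id.
ANCHOR_VALUE = {
    10157: 2,  # FGM -> grayMatter
    10331: 8,  # CN  -> deepGrayMatter
    10654: 7,  # HGM -> brainStem
    10656: 5,  # CB  -> cerebellarGrayMatter
}


def classify(id_path, anchor_value):
    """Walk the path leaf -> root and return the first anchor's tissue value."""
    for structure_id in reversed(id_path):
        if structure_id in anchor_value:
            return anchor_value[structure_id]
    return 0  # no anchor on the path -> background


# Shortened root-to-leaf paths; 900001 and 900002 are made-up leaf ids.
cerebellar_leaf = [10654, 10656, 900001]  # under HGM, then CB
pontine_leaf = [10654, 900002]            # under HGM only
container_node = [10153]                  # above every anchor

print(classify(cerebellar_leaf, ANCHOR_VALUE))  # CB is hit before HGM
print(classify(pontine_leaf, ANCHOR_VALUE))
print(classify(container_node, ANCHOR_VALUE))
```

Walking the reversed path is what makes the deepest anchor win: the first hit is by construction the most specific one.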

The complete anchor table

29 anchors, exactly as defined in classify_tissue.py:

Anchor	id	Structure name	→ tissue	value	Rationale
`FGM`	10157	gray matter of forebrain	grayMatter	2	Default for forebrain gray = cerebral cortex. The deep gray nuclei below override to `deepGrayMatter(8)`.
`CN`	10331	cerebral nuclei	deepGrayMatter	8	Basal ganglia (caudate, putamen, globus pallidus, nucleus accumbens), amygdala, claustrum, basal forebrain, septum, BNST. Deeper than `FGM` → overrides.
`THM`	10390	thalamus	deepGrayMatter	8	All thalamic nuclei + habenula, pineal, zona incerta. Deeper than `FGM` → overrides.
`SubTH`	10465	subthalamus	deepGrayMatter	8	Subthalamic nucleus. Deeper than `FGM` → overrides.
`HTH`	10467	hypothalamus	deepGrayMatter	8	Preoptic/supraoptic/tuberal/mammillary regions. Deeper than `FGM` → overrides.
`FWM`	10557	white matter of forebrain	whiteMatter	3	Fiber tracts: corpus callosum, internal capsule, fornix, etc.
`FV`	10595	ventricles of forebrain	ventricle	4	Lateral + third ventricles and their horns.
`FSS`	10609	surface structures of forebrain	grayMatter	2	Default for the surface branch — dominated by cortical gyri/lobules (the macroscopic gray-matter parcellation) plus surface gray nuclei (mammillary body, etc.). Overridden below for non-gray members.
`CeS`	10610	cerebral sulci	cerebrospinalFluid	1	Sulci & fissures (central sulcus, Sylvian/longitudinal/calcarine fissures…) are the infolded subarachnoid CSF spaces. Deeper than `FSS`, so it overrides.
`ASFV`	146034908	adjoining structures of forebrain ventricles	cerebrospinalFluid	1	Peri-ventricular CSF-adjacent structures. Overrides `FSS`.
`fbv`	266441685	blood vessels of forebrain	background	0	Vasculature is not one of the 9 tissue classes.
`FTS`	10506	transient structures of forebrain	background	0	Transient developmental structures (germinal zones, subplate) have no tissue class. Reviewable default.
`MGM`	10649	gray matter of midbrain	brainStem	7	Midbrain is part of the brain stem; `map.csv` has no separate midbrain GM/WM, so the whole midbrain collapses to brainStem.
`MWM`	10650	white matter of midbrain	brainStem	7	Same — midbrain fiber tracts are brain stem.
`MV`	10651	ventricle of midbrain	ventricle	4	Cerebral aqueduct.
`MSS`	10652	surface structures of midbrain	brainStem	7	Tectal surface, colliculi, nerve roots — brain stem surface.
`mbv`	266441705	blood vessels of midbrain	background	0	Vasculature.
`HGM`	10654	gray matter of the hindbrain	brainStem	7	Default for hindbrain gray = pons + medulla → brain stem. Cerebellum (deeper) overrides.
`HWM`	10668	white matter of hindbrain	cerebellarWhiteMatter	6	Catch-all hindbrain WM whose bulk voxels are the cerebellar internal white matter (arbor vitae) — `CB` is filed under the gray branch (`HGM`), so its internal WM has no node and lands here. Spatial check: 100% of `HWM` voxels lie inside the cerebellum bbox, 85.5% closer to the cerebellum than the brainstem. Its peduncle children (`icp`, `mcp`) inherit 6.
`HV`	10669	ventricles of hindbrain	ventricle	4	Fourth ventricle, central canal.
`HSS`	10670	surface structures of hindbrain	brainStem	7	Default for hindbrain surface = pons/medulla surface. Cerebellar surface (deeper) overrides.
`hbv`	266441737	blood vessels of hindbrain	background	0	Vasculature.
`HTS`	10663	transient structures of hindbrain	background	0	Transient developmental structures. Reviewable default.
`CB`	10656	cerebellum	cerebellarGrayMatter	5	Cerebellar cortex + deep nuclei (both gray). Deeper than `HGM` → overrides brainStem.
`CbSS`	12827	surface structures of cerebellum	cerebellarGrayMatter	5	Cerebellar lobes/lobules (gray). Deeper than `HSS` → overrides brainStem.
`cbf`	12828	cerebellar fissures	cerebrospinalFluid	1	Cerebellar fissures = CSF. Deeper than `CbSS` → overrides cerebellar gray.
`scp`	12354	superior cerebellar peduncle	cerebellarWhiteMatter	6	Cerebellar WM tract filed under `MWM` (midbrain). Deeper than `MWM` → overrides brainStem.
`xscp`	12337	decussation of superior cerebellar peduncle	cerebellarWhiteMatter	6	Same; 0 voxels in this volume but anatomically cerebellar WM. (`icp`/`mcp` under `HWM` inherit 6, no anchor needed.)
`SpC`	12890	spinal cord	background	0	Not a brain tissue class. Reviewable default.

Default (no anchor on path): background (0). This catches the abstract container nodes above the GM/WM/V split — neural plate, neural tube, brain, forebrain, midbrain, hindbrain — which never appear as voxel labels.

Worked resolution examples

Structure (acronym, path tail)	Anchors hit (leaf→root)	Result
precentral gyrus (`PrCG`)	`FSS`	gray (2)
putamen (`Pu`)	`CN` → `FGM`	deep gray (8)
pulvinar of thalamus (`Pul`)	`THM` → `FGM`	deep gray (8)
head of hippocampus (`HiH`)	`FSS`	gray (2)
longitudinal fissure (`los`)	`CeS` → `FSS`	CSF (1)
corpus callosum (`cc`)	`FWM`	white (3)
third ventricle (`3V`)	`FV`	ventricle (4)
cerebellar lobule (under `CBLL`)	`CbSS` → `HSS`	cerebellar gray (5)
cerebellar fissure (under `cbf`)	`cbf` → `CbSS` → `HSS`	CSF (1)
pons tegmentum (`PnTg`)	`HGM` (via Met/Pn)	brain stem (7)
red nucleus (`RN`, midbrain)	`MGM`/`MWM`	brain stem (7)
spinal cord leaf	`SpC`	background (0)

Key design decisions

Midbrain + pons + medulla → a single brainStem (7). map.csv defines no separate brain-stem gray/white, and lists cerebellar gray/white as the only posterior-fossa subdivisions. So the entire midbrain and the non-cerebellar hindbrain collapse to brain stem, regardless of their internal gray/white organization. (Their ventricles still go to ventricle (4), since those branches are siblings of the GM/WM nodes, never nested under them.)
Gyri are gray matter; sulci/fissures are CSF. The FSS "surface structures" branch is a macroscopic landmark parcellation containing both the cortical gyri (CeG, gray) and the sulci/fissures (CeS, the CSF-filled subarachnoid clefts). They are split accordingly. This is the one place where a naive "whole branch → one tissue" rule fails, and the deepest-wins override (CeS → CSF inside FSS → gray) handles it.
Deep gray nuclei → deepGrayMatter (8). The subcortical/diencephalic nuclei (basal ganglia, amygdala, thalamus, hypothalamus, subthalamic nucleus) are siblings of cortex under FGM. The four branch anchors CN/THM/SubTH/HTH (deeper than FGM) pull them into class 8, leaving cortex in class 2. Scope is deliberately forebrain only: the midbrain deep nuclei (substantia nigra SN, red nucleus RN) stay brainStem (7) to keep the "midbrain = brain stem" rule intact, and the cerebellar deep nuclei (CbDN) stay cerebellarGrayMatter (5). The hippocampus stays grayMatter (2): the atlas files it under the surface branch FSS (allocortex), not under CN, so the deep-gray anchors never reach it.

CSF (class 1) is sourced from the brain mask, not the ontology

The Allen annotation carries no CSF voxels: 106 sulci/fissure structures map to class 1 in the ontology, but none of them are segmented in annotation_full.nii.gz. Subarachnoid/sulcal CSF is therefore defined spatially instead, in apply_classification.py, using the SynthStrip brain mask:

every voxel that is inside the brain mask (mask == 1) but left unlabelled by the annotation (classify == 0) becomes CSF (1).

The mask was generated with SynthStrip:

synthstrip -i mni_icbm152_t1_tal_nlin_sym_09b_hires.nii.gz \
           -m mni_icbm152_t1_tal_nlin_sym_09b_hires_synthstrip_mask.nii.gz

These are exactly the CSF-filled clefts the parenchymal atlas does not cover. The fill adds 2,811,582 CSF voxels (only 1,060 annotation voxels fall outside the mask). After this step every class 1–8 is populated in the volume.
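The fill rule is a single boolean-mask assignment. The sketch below applies it to a tiny hand-made 2×4 grid instead of the real volumes; the array contents are invented for illustration and the variable names are assumptions.

```python
import numpy as np

# Toy relabeled annotation: 0 = unlabelled, 2 = gray, 3 = white.
labels = np.array([[0, 0, 2, 3],
                   [0, 2, 0, 0]], dtype=np.uint8)

# Toy brain mask on the same grid: 1 = brain, 0 = non-brain.
mask = np.array([[0, 1, 1, 1],
                 [0, 1, 1, 0]], dtype=np.uint8)

out = labels.copy()
fill = (mask == 1) & (out == 0)  # inside the brain but left unlabelled
out[fill] = 1                    # becomes CSF (class 1)

print(out)
print(int(fill.sum()), "voxels filled")
```

Voxels outside the mask keep their value, so an unlabelled voxel outside the brain stays background and an annotated voxel outside the mask keeps its tissue class.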

`6 cerebellarWhiteMatter` — cerebellar internal WM + peduncles

The atlas has no cerebellar-white-matter node under the cerebellum (CB divides only into cortex CBC and deep nuclei CbDN, both gray; there is no arbor-vitae / internal-WM node). The cerebellar white matter is instead recovered from two places the ontology files outside CB:

HWM (white matter of hindbrain) — the bulk. This catch-all's voxels are the cerebellar internal white matter (arbor vitae), because CB is filed under the gray branch (HGM) and its internal WM has no dedicated node. A spatial check on annotation_full.nii.gz confirms it: 100% of HWM voxels lie inside the cerebellum bounding box, 85.5% are closer to the cerebellum centroid than to the brainstem, and the HWM centroid (z,y,x)=(69,152,196) nearly coincides with the cerebellum-gray centroid (71,142,196). So HWM → 6.

The three cerebellar peduncles, filed under midbrain/hindbrain WM:

peduncle	acronym	id	voxels (csv)	filed under
superior cerebellar peduncle	`scp`	12354	2,338	`MWM` (midbrain WM)
inferior cerebellar peduncle	`icp`	12741	1,592	`HWM` (hindbrain WM)
middle cerebellar peduncle	`mcp`	12768	10,905	`HWM` (hindbrain WM)
decussation of scp	`xscp`	12337	0	`MWM`

scp/xscp (under MWM) are explicit anchors → 6 (deeper than MWM, so deepest-wins overrides brainStem). icp/mcp simply inherit HWM's 6.

Together these give 143,819 csv voxels / 287,566 in the volume for class 6 (cerebellar gray:white ≈ 3.9:1, anatomically reasonable).

Residual trade-off. HWM is one atomic label in the volume, so the ~14.5% of its voxels near the pons (transitional pontocerebellar / MCP-entry fibers) also go to class 6. This is acceptable — they are cerebellar-bound fibers — and far better than the alternative, which would mislabel the 85.5% cerebellar core as brainstem. The remaining class 7 (midbrain + pons + medulla gray + their named tracts, 139,773 csv voxels) is the true brainstem.

Class distribution

value	tissue	structures (all 3,317)	structures (141 voxel-bearing)
0	background	788	0
1	cerebrospinalFluid	106	0
2	grayMatter	1,198	57
3	whiteMatter	140	8
4	ventricle	23	9
5	cerebellarGrayMatter	86	4
6	cerebellarWhiteMatter	91	4
7	brainStem	530	12
8	deepGrayMatter	355	47

Only 141 of the 3,317 ontology nodes are annotated=True (carry voxels); the rest are abstract intermediate nodes. Every one of the 141 voxel-bearing structures resolves to a real tissue class (1–8) — none fall to background, which is the primary correctness check. (Class 1 / CSF has 0 voxel-bearing structures because its voxels come from the brain-mask fill, not the ontology.)

Pipeline & usage

One command builds everything from scratch — download → brain mask → relabel → priors:

./generate_all.sh

It downloads the external inputs, runs SynthStrip for the brain mask (skipped if the mask is already present; override with SYNTHSTRIP=<cmd>), then runs all four stages below. The Python deps (SimpleITK + numpy, declared in requirements.txt) are pulled in per stage by uv run --with-requirements, so no virtual environment is managed by hand. Inputs and mask are reused if present; the pipeline stages always re-run.

Requirements: curl, unzip, gzip, uv, and SynthStrip (only if the mask is not already present).

To run the stages by hand instead:

# Stage 0 — download the external inputs (Allen ontology + volume, MNI T1)
./generate_all.sh   # (also runs stages 1–4; see above)

# Stage 1 — build the lookup tables (stdlib only)
python3 classify_tissue.py

# Stage 2 — relabel the volume + spatial CSF fill (SimpleITK)
uv run --with-requirements requirements.txt python apply_classification.py

# Stage 3 — probabilistic tissue priors for ANTs Atropos (SimpleITK)
uv run --with-requirements requirements.txt python make_priors.py

# Stage 4 — coarser schemes (4-class, 3-class, cbmerge): volumes + priors
uv run --with-requirements requirements.txt python make_coarse.py

`classify_tissue.py`

Reads voxel_count.csv with csv.DictReader (the name column contains commas — a real CSV parser is required).
Resolves the 29 anchor acronyms to ids and builds the anchor→value table.
Classifies every structure by the deepest-anchor-on-path rule.
Writes the LUTs, numerically sorted by id: tissue_map.lut (all nodes), tissue_map_annotated.lut (voxel-bearing nodes), and tissue_map_background.lut (a visualization aid mapping background-class structures → 1, else 0).
Prints an audit: per-class counts, and flags any voxel-bearing structure that fell to background (would indicate a misclassification to review).
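The first two steps can be sketched as follows. This is not the script itself: the sample uses a reduced column set, the third row (`XYZ`, id 99901) is made up to show a name containing a comma, and `NOPE` is a deliberately unresolvable acronym.

```python
import csv
import io

# Reduced, partly made-up sample; the real file has 13 columns and 3,317 rows.
SAMPLE = '''id,acronym,name
10157,FGM,gray matter of forebrain
10656,CB,cerebellum
99901,XYZ,"made-up nucleus, dorsal part"
'''

rows = list(csv.DictReader(io.StringIO(SAMPLE)))
acronym_to_id = {row["acronym"]: int(row["id"]) for row in rows}

# Anchors are declared by acronym and resolved to ids at runtime.
wanted = {"FGM": 2, "CB": 5, "NOPE": 0}
anchor_value = {}
unresolved = []
for acronym, value in wanted.items():
    if acronym in acronym_to_id:
        anchor_value[acronym_to_id[acronym]] = value
    else:
        unresolved.append(acronym)  # would be reported as a warning

print(rows[2]["name"])   # the quoted comma survives a real CSV parser
print(anchor_value)
print("unresolved:", unresolved)
```

A naive `line.split(",")` would break the third row into four fields, which is why a real CSV parser is needed here.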

`apply_classification.py`

Loads tissue_map.lut (the all-nodes superset) as the single source of truth — the relabeling logic is not duplicated here.
Reads annotation_full.nii.gz into a numpy array.
Sparse vectorized remap. Ids reach ~2.7 × 10⁸, so a dense lookup array is avoided. Instead: u = np.unique(arr), build the small vals array for just those ids, then out = vals[np.searchsorted(u, arr)]. Voxels with id 0 or any id absent from the LUT → background.
Spatial CSF fill. Reads the SynthStrip brain mask (grid asserted to match) and sets out[(mask == 1) & (out == 0)] = 1, sourcing the CSF class the annotation does not carry (see "CSF (class 1)" above).
Writes classify.nii.gz as uint8, calling CopyInformation to preserve spacing, origin and direction.
Runs the verification below.
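The sparse remap described above can be sketched on a toy array. The LUT entries are the three example rows from the Outputs section; the array itself is invented, and 266441737 stands in for an id that is absent from this small LUT.

```python
import numpy as np

# Three rows of the LUT (id -> tissue value).
lut = {10329: 2, 12369: 4, 12384: 5}

# Toy label array; ids this large make a dense lookup table impractical.
arr = np.array([[0, 10329, 12369],
                [12384, 266441737, 10329]], dtype=np.uint32)

u = np.unique(arr)  # sorted unique ids actually present in the array
# One tissue value per unique id; id 0 and ids missing from the LUT -> 0.
vals = np.array([lut.get(int(i), 0) for i in u], dtype=np.uint8)
# searchsorted maps every voxel to the position of its id in u.
out = vals[np.searchsorted(u, arr)]

print(out)
```

The lookup array has one entry per distinct id in the volume, not one per possible id, so memory stays small regardless of how large the ids are.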

`make_priors.py`

Converts the hard classify.nii.gz into soft spatial priors:

For each tissue class 1–8, builds a binary mask and blurs it with a 2 mm FWHM Gaussian (sigma = FWHM / (2·√(2·ln2)) = 0.8493 mm). Uses sitk.DiscreteGaussian (a non-negative separable kernel, spacing-aware).
Normalizes across classes per voxel: prior_c = smoothed_c / Σ smoothed wherever the total support exceeds 1e-4; elsewhere all priors are 0. The result is a proper per-voxel probability distribution over the 8 classes.
Writes prior01.nii.gz … prior08.nii.gz (float32, 0–1), priorNN ↔ class NN, preserving geometry via CopyInformation.
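The sigma conversion and the per-voxel normalization can be checked with a small numpy sketch. The Gaussian blur itself is skipped here: the `smoothed` array is a hand-written stand-in for two blurred class masks over three voxels, so this illustrates only the arithmetic, not make_priors.py.

```python
import math

import numpy as np

# FWHM -> sigma for a Gaussian: sigma = FWHM / (2 * sqrt(2 * ln 2)).
fwhm_mm = 2.0
sigma_mm = fwhm_mm / (2.0 * math.sqrt(2.0 * math.log(2.0)))
print(f"sigma = {sigma_mm:.4f} mm")

# Stand-in for blurred masks, shape (classes, voxels). Values are invented.
smoothed = np.array([[0.6, 0.2, 0.00001],
                     [0.2, 0.2, 0.00002]])

total = smoothed.sum(axis=0)   # total support per voxel
support = total > 1e-4         # normalize only where support is meaningful
priors = np.zeros_like(smoothed)
priors[:, support] = smoothed[:, support] / total[support]

print(priors)
print(priors.sum(axis=0))      # 1 inside the support, 0 outside
```

The third voxel has a total below the 1e-4 threshold, so all of its priors stay at 0 instead of being inflated by a near-zero denominator.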

Worked example (one axial slice)

The figures below are all the same axial slice, rendered by make_figures.py (run it after a build: uv run --with-requirements requirements.txt --with matplotlib python make_figures.py).

Stage 0 — inputs: the MNI T1 (the SynthStrip mask is derived from this) and the Allen annotation volume (each colour = a distinct structure id):

Stage 1 builds the text LUTs (tissue_map*.lut); stage 2 applies them to the annotation and adds the mask-sourced CSF, giving the 9-class segmentation:

Stage 3 — the eight probabilistic priors (smoothed, normalized to 0–1). At this slice prior06 (cerebellar WM) is nearly empty, as expected:

Stage 4 — the coarser schemes. cbmerge keeps brain stem (brown), cerebellum (purple) and deep gray (pink) distinct; 4-class/3-class fold further:

Outputs

File	Format	Contents
`tissue_map.lut`	`id value`, space-separated, 3,317 rows	All ontology nodes → tissue value.
`tissue_map_annotated.lut`	`id value`, space-separated, 141 rows	Only the voxel-bearing structures.
`classify.nii.gz`	NIfTI, `uint8`, values 0–8	The tissue segmentation (same grid as the input volume), incl. mask-sourced CSF.
`prior01.nii.gz … prior08.nii.gz`	NIfTI, `float32`, 0–1	Per-class probabilistic priors (`priorNN` ↔ class NN), Atropos-ready.

classify_tissue.py also writes tissue_map_background.lut (a visualization aid: background-class structures → 1, else 0). The coarser schemes add classify_<tag>.nii.gz and prior_<tag>_0N.nii.gz — see Coarser schemes.

Example LUT rows:

10329 2
12369 4
12384 5
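Reading a LUT back is a matter of splitting each line on whitespace. A minimal sketch, assuming the two-column `id value` format shown above (the function name `read_lut` is illustrative):

```python
import io


def read_lut(handle):
    """Parse 'id value' lines into a dict of int -> int, skipping blank lines."""
    lut = {}
    for line in handle:
        line = line.strip()
        if not line:
            continue
        structure_id, value = line.split()
        lut[int(structure_id)] = int(value)
    return lut


# The three example rows, read from an in-memory file.
lut = read_lut(io.StringIO("10329 2\n12369 4\n12384 5\n"))
print(lut)
```

With the real files, the same function would be called on an open file handle, for example `with open("tissue_map.lut") as f: lut = read_lut(f)`.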

Verification

Four independent checks, all passing:

No voxel-bearing structure falls to background. All 141 annotated=True structures resolve to a tissue class 1–8.
The output volume contains only valid tissue values. np.unique(classify.nii.gz) == [0, 1, 2, 3, 4, 5, 6, 7, 8] — every class 1–8 is populated (class 1 via the brain-mask fill, the rest from the annotation).
Per-structure voxel-count cross-check against voxel_count.csv. Aggregating the csv voxel_count column by tissue value and comparing to the actual voxel counts in classify.nii.gz:
```
class                    volume voxels   csv voxels   ratio
2 grayMatter                 7144564      3572282   2.000
3 whiteMatter                4393397      2207402   1.990
4 ventricle                   233697       120733   1.936
5 cerebellarGrayMatter       1115998       557999   2.000
6 cerebellarWhiteMatter        287566       143819   1.999
7 brainStem                   275416       139773   1.970
8 deepGrayMatter              512772       257193   1.994
```
Every leaf structure matches at exactly 2.000× — the volume is at twice the count-scale of voxel_count.csv (the csv counts are at half scale). The ratio being a single shared constant across all classes is the real proof of mapping consistency: a relabeling bug would skew classes by different amounts. The small dips below 2.0 for thin structures (ventricle 1.936) are half-scale partial-volume effects on the csv counts, not mapping errors.
Priors are a valid probability distribution and recover the hard labels. Each prior0N.nii.gz lies in [0, 1]; the per-voxel sum is 1.000000 inside the support and 0 outside. Taking argmax over the 8 priors reproduces the hard classify.nii.gz label at 97.2% of supported labelled voxels and 99.98% of strong-prior interiors (>0.9) — the ~2.8% gap is expected boundary softening from the 2 mm-FWHM Gaussian, not a labeling error.
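The ratio column of the voxel-count cross-check (the third check above) can be recomputed directly from the two count columns. The counts below are copied from that table; the 0.1 tolerance is an assumed threshold chosen only for illustration.

```python
# (volume voxels, csv voxels) per class, copied from the cross-check table.
COUNTS = {
    "grayMatter": (7144564, 3572282),
    "whiteMatter": (4393397, 2207402),
    "ventricle": (233697, 120733),
    "cerebellarGrayMatter": (1115998, 557999),
    "cerebellarWhiteMatter": (287566, 143819),
    "brainStem": (275416, 139773),
    "deepGrayMatter": (512772, 257193),
}

ratios = {name: vol / csv_count for name, (vol, csv_count) in COUNTS.items()}
for name, ratio in ratios.items():
    print(f"{name:<24}{ratio:.3f}")

# Flag any class whose ratio strays far from the shared factor of 2.
TOLERANCE = 0.1  # assumed threshold, not taken from the repository
outliers = [name for name, r in ratios.items() if abs(r - 2.0) > TOLERANCE]
print("outliers:", outliers)
```

A relabeling bug that moved a structure into the wrong class would shift two ratios in opposite directions, so an empty outlier list is a quick consistency signal.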

Coarser schemes: 4-class, 3-class, cbmerge

make_coarse.py derives reduced label sets by collapsing the 8 tissue classes of classify.nii.gz (no second ontology walk — the 9-class result is the source of truth). It regenerates, for each scheme, the LUTs, the relabeled volume, and the priors, reusing the identical 2 mm-FWHM DiscreteGaussian smoothing + per-voxel normalization as make_priors.py.

How each 9-class label folds in (brain stem → white matter by project decision; ventricles → CSF; cerebellar GM/WM merge into GM/WM):

9-class source	→ 4-class (CSF/GM/WM/DeepGM)	→ 3-class (CSF/GM/WM)
1 cerebrospinalFluid	1 CSF	1 CSF
4 ventricle	1 CSF	1 CSF
2 grayMatter	2 GM	2 GM
5 cerebellarGrayMatter	2 GM	2 GM
8 deepGrayMatter	4 DeepGM	2 GM
3 whiteMatter	3 WM	3 WM
6 cerebellarWhiteMatter	3 WM	3 WM
7 brainStem	3 WM	3 WM

A third scheme, cbmerge (7 classes), keeps the full 9-class granularity but merges cerebellar gray (5) + cerebellar white (6) into a single cerebellum label, renumbered contiguous: 1 CSF, 2 GM, 3 WM, 4 ventricle, 5 cerebellum, 6 brainStem, 7 deepGM. (All other classes unchanged; nothing else folds.)
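The three folds can be written as plain lookup tables. This sketch mirrors the table and the cbmerge description above; it is not the code in make_coarse.py, and mapping background 0 to 0 in every scheme is an assumption. The names `FOLD` and `fold` are illustrative.

```python
# 9-class value -> coarse value, per scheme.
FOLD = {
    "4class":  {0: 0, 1: 1, 4: 1, 2: 2, 5: 2, 8: 4, 3: 3, 6: 3, 7: 3},
    "3class":  {0: 0, 1: 1, 4: 1, 2: 2, 5: 2, 8: 2, 3: 3, 6: 3, 7: 3},
    "cbmerge": {0: 0, 1: 1, 2: 2, 3: 3, 4: 4, 5: 5, 6: 5, 7: 6, 8: 7},
}


def fold(labels, scheme):
    """Map a sequence of 9-class labels to the given coarse scheme."""
    table = FOLD[scheme]
    return [table[value] for value in labels]


nine_class = [0, 1, 2, 3, 4, 5, 6, 7, 8]
for scheme in FOLD:
    print(scheme, fold(nine_class, scheme))
```

The only difference between the 4-class and 3-class tables is the entry for deepGrayMatter (8), which is kept separate in the former and folded into GM in the latter.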

All schemes verify the same way as the 9-class priors: labels in range, per-voxel prior sum 1.000000 in support / 0 outside → CONSISTENT. (For the 4-/3-class schemes, CSF and WM voxel coverage is identical; only GM vs DeepGM differs, via per-scheme renormalization.)

Outputs per scheme <tag> ∈ {4class, 3class, cbmerge}: map_<tag>.csv, tissue_map_<tag>.lut, tissue_map_<tag>_annotated.lut, classify_<tag>.nii.gz, and prior_<tag>_0N.nii.gz (Atropos-ready as -p prior_<tag>_%02d.nii.gz).

Reproducibility & customization

The entire mapping is defined by the ANCHORS dict at the top of classify_tissue.py. To change a decision, edit one line and rerun the pipeline stages (1–4), since the volume, priors, and coarser schemes are all derived from the LUTs.
Reviewable defaults, called out explicitly so they are easy to revisit:
- transient developmental structures (FTS, HTS) → background (0)
- spinal cord (SpC) → background (0)
- blood vessels (fbv, mbv, hbv) → background (0)
- sulci/fissures (CeS, cbf) → CSF (1)
- bare container nodes (neural plate/tube, brain, forebrain/mid/hindbrain) → background (0)
Because anchors are keyed by acronym and resolved at runtime (with a warning on any unresolved acronym), the pipeline is robust to id renumbering across atlas versions.

File manifest

voxel_count.csv             input  — Allen ontology (3,317 structures)
map.csv                     input  — 9 tissue classes
annotation_full.nii.gz      input  — label volume (Allen ids per voxel)
mni_icbm152_..._synthstrip_mask.nii.gz  input — brain mask (CSF source)
generate_all.sh             full build: download inputs + mask + run pipeline
requirements.txt            python deps (SimpleITK, numpy) for `uv run`
classify_tissue.py          builds the LUTs from the ontology hierarchy
apply_classification.py     relabels the volume + mask CSF fill via SimpleITK
make_priors.py              builds the 8 probabilistic priors via SimpleITK
make_coarse.py              derives the 4-class/3-class/cbmerge schemes (LUT/vol/priors)
make_figures.py             renders the images/ example-slice figures (matplotlib)
images/                     example-slice PNGs embedded in this README
tissue_map.lut              output — all 3,317 nodes  → tissue value
tissue_map_annotated.lut    output — 141 voxel-bearing → tissue value
tissue_map_background.lut   output — viz aid: background-class structures → 1
classify.nii.gz             output — 9-class tissue segmentation (uint8)
prior01..08.nii.gz          output — per-class probabilistic priors (float32)
map_<tag>.csv               output — coarse scheme defs (tag: 4class/3class/cbmerge)
tissue_map_<tag>[_annotated].lut  output — coarse id → tissue LUTs
classify_<tag>.nii.gz       output — coarse tissue segmentations (uint8)
prior_<tag>_0N.nii.gz       output — coarse per-class priors (float32)
