Collapse a fine-grained Allen Developing Human Brain Atlas label volume into a 9-class tissue segmentation, and document exactly how every structure was assigned to a tissue type.
The atlas labels ~3,300 named neuroanatomical structures. For tissue-level work (segmentation priors, registration targets, volumetrics) those need to be reduced to a handful of tissue classes: gray matter, deep gray matter, white matter, ventricles, CSF, cerebellar gray/white, and brain stem. This repository derives that reduction deterministically from the atlas ontology itself and applies it to the label volume.
The ontology and label volume come from the Allen Human Reference Atlas – 3D, 2020:
- Announcement / overview: https://community.brain-map.org/t/allen-human-reference-atlas-3d-2020-new/405
- Data download (version 1): https://download.alleninstitute.org/informatics-archive/allen_human_reference_atlas_3d_2020/version_1/
The T1 image the brain mask is derived from is the MNI ICBM152 2009b nonlinear symmetric template: http://www.bic.mni.mcgill.ca/~vfonov/icbm/2009/mni_icbm152_nlin_sym_09b_nifti.zip
| File | What it is |
|---|---|
| `voxel_count.csv` | The Allen ontology: one row per structure (3,317 rows). Columns: `id`, `graph_order`, `acronym`, `name`, `color_hex_triplet`, `parent_structure_id`, `structure_id_path`, `annotated`, `voxel_count`, `subgraph_annotated`, `subgraph_voxel_count`, `volume_mm3`, `volume_cm3`. |
| `map.csv` | The 9 target tissue classes. Columns: `value`, `shortName`, `description`. |
| `annotation_full.nii.gz` | The label volume: each voxel holds an Allen structure id. uint32, 394×466×378, 0.5 mm isotropic. |
| `mni_icbm152_t1_tal_nlin_sym_09b_hires_synthstrip_mask.nii.gz` | SynthStrip brain mask on the same grid (1 = brain, 0 = non-brain). Used to source CSF (class 1) spatially; see below. |
| value | shortName | description |
|---|---|---|
| 0 | background | unlabelled voxels |
| 1 | cerebrospinalFluid | CSF excluding the ventricles (outside brain) |
| 2 | grayMatter | gray matter tissue |
| 3 | whiteMatter | white matter tissue |
| 4 | ventricle | ventricles (CSF inside the brain) |
| 5 | cerebellarGrayMatter | gray matter tissue in the cerebellum |
| 6 | cerebellarWhiteMatter | white matter tissue in the cerebellum |
| 7 | brainStem | brain stem |
| 8 | deepGrayMatter | deep gray matter (subcortical/diencephalic nuclei) |
The ontology in voxel_count.csv is the Allen Brain Map StructureGraph used
by the live API. The API tags these structures
graph_id = 16, ontology_id = 11.
Only the appended per-volume columns (annotated, voxel_count,
subgraph_voxel_count, volume_mm3, volume_cm3) are extra; the hierarchy the
classifier walks (structure_id_path, acronyms) is 1:1 with the API. The whole
graph can be pulled live:

    curl "https://api.brain-map.org/api/v2/data/query.json?criteria=model::Structure,rma::criteria,\[graph_id\$eq16\]&num_rows=5000"

Reference documentation for the hierarchy:

| Resource | What it covers |
|---|---|
| `README.pdf` (ships beside the inputs) | Authoritative doc for this atlas: the 141-structure annotation volume, the hierarchical ontology, and the ITK-SNAP label files. |
| Atlas Drawings and Ontologies — Allen API | How a StructureGraph works: each structure (except root) has one parent = a "part-of" edge — the model behind structure_id_path. Ontology downloadable as hierarchical JSON. |
| Allen Atlas Viewer | Browse the structure tree interactively. |
| Brain Map community thread | Release announcement and Q&A. |
| Human Reference Atlas ontologies paper (PMC10043028) | Peer-reviewed context on the specimen/structure/spatial ontologies. |
voxel_count.csv and map.csv share no common column. Nothing in the atlas
row for "precentral gyrus" says it is gray matter; nothing says the cerebral
aqueduct is a ventricle. The mapping has to be inferred from anatomy.
The Allen ontology already encodes the tissue divisions in its own tree structure. Every major brain division splits into identically-named sub-branches:

    brain
    ├── forebrain (F)
    │   ├── gray matter of forebrain (FGM)
    │   ├── white matter of forebrain (FWM)
    │   ├── ventricles of forebrain (FV)
    │   ├── surface structures of forebrain (FSS)   ← gyri, sulci, surface nuclei
    │   ├── transient structures (FTS)
    │   └── blood vessels (fbv)
    ├── midbrain (M)
    │   ├── gray matter of midbrain (MGM)
    │   ├── white matter of midbrain (MWM)
    │   ├── ventricle of midbrain (MV)
    │   ├── surface structures (MSS)
    │   └── blood vessels (mbv)
    └── hindbrain (H)
        ├── gray matter of the hindbrain (HGM)
        │   ├── metencephalon (Met)
        │   │   ├── cerebellum (CB) ── cerebellar cortex + deep nuclei
        │   │   └── pons (Pn)
        │   └── myelencephalon / medulla (Mo)
        ├── white matter of hindbrain (HWM)   ← fiber tracts, cerebellar peduncles
        ├── ventricles of hindbrain (HV)
        ├── surface structures (HSS)
        │   └── surface structures of cerebellum (CbSS) ── lobules + fissures
        └── blood vessels (hbv)

So tissue type can be read off which sub-branch a structure lives in.
Each row carries its full ancestry as a path, e.g.
/10153/10154/10155/10156/10157/10158/10159/10313/10327/10329/
This is the authoritative parent chain. (The parent_structure_id column is
float-formatted, e.g. 10153.0, and does not key cleanly against the integer
id, so it is not used.)
A small set of branch nodes are designated anchors, each tagged with a tissue
value. To classify any structure, walk its structure_id_path from the structure
itself upward toward the root and take the tissue value of the first anchor
encountered (i.e. the deepest / most specific anchor on the path). If no
anchor is on the path, the structure is background (0).
"Deepest wins" is what makes nested overrides work automatically. For example the
cerebellum (CB) sits inside "gray matter of the hindbrain" (HGM). HGM is an
anchor → brain stem (7), but CB is a deeper anchor → cerebellar gray (5). A
cerebellar leaf hits CB before HGM, so it correctly resolves to 5, not 7.
Anchors are keyed by acronym and resolved to ids at runtime, so the logic survives id changes in future atlas releases.
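
The deepest-anchor rule can be sketched in a few lines. This is a minimal illustration, not the code in `classify_tissue.py`: the function name `classify_path` and the trimmed `EXAMPLE_ANCHOR_VALUES` table are assumptions, filled with four ids from the anchor table in this README. The second path in the usage lines is hypothetical (only its `HGM` and `CB` ids are real) and is there to show the override.

```python
# Assumed subset of the anchor table: structure id -> tissue value.
EXAMPLE_ANCHOR_VALUES = {
    10157: 2,  # FGM -> grayMatter
    10331: 8,  # CN  -> deepGrayMatter
    10654: 7,  # HGM -> brainStem
    10656: 5,  # CB  -> cerebellarGrayMatter
}


def classify_path(structure_id_path, anchor_values=EXAMPLE_ANCHOR_VALUES):
    """Return the tissue value of the deepest anchor on the path, else 0."""
    ids = [int(part) for part in structure_id_path.strip("/").split("/") if part]
    # Walk from the structure itself up toward the root; first anchor wins.
    for structure_id in reversed(ids):
        if structure_id in anchor_values:
            return anchor_values[structure_id]
    return 0  # no anchor on the path -> background


# Path taken from the example above; it passes through FGM (10157).
cortical = "/10153/10154/10155/10156/10157/10158/10159/10313/10327/10329/"
# Hypothetical path: a made-up leaf 99999 nested under CB, itself under HGM.
cerebellar = "/10153/10654/10656/99999/"

print(classify_path(cortical))    # 2
print(classify_path(cerebellar))  # 5 (CB is hit before HGM)
print(classify_path("/10153/"))   # 0
```

Walking the reversed list is what implements "deepest wins": no priority table is needed, because nesting depth in the path already encodes specificity.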
29 anchors, exactly as defined in classify_tissue.py:
| Anchor | id | Structure name | → tissue | value | Rationale |
|---|---|---|---|---|---|
| `FGM` | 10157 | gray matter of forebrain | grayMatter | 2 | Default for forebrain gray = cerebral cortex. The deep gray nuclei below override to deepGrayMatter (8). |
| `CN` | 10331 | cerebral nuclei | deepGrayMatter | 8 | Basal ganglia (caudate, putamen, globus pallidus, nucleus accumbens), amygdala, claustrum, basal forebrain, septum, BNST. Deeper than FGM → overrides. |
| `THM` | 10390 | thalamus | deepGrayMatter | 8 | All thalamic nuclei + habenula, pineal, zona incerta. Deeper than FGM → overrides. |
| `SubTH` | 10465 | subthalamus | deepGrayMatter | 8 | Subthalamic nucleus. Deeper than FGM → overrides. |
| `HTH` | 10467 | hypothalamus | deepGrayMatter | 8 | Preoptic/supraoptic/tuberal/mammillary regions. Deeper than FGM → overrides. |
| `FWM` | 10557 | white matter of forebrain | whiteMatter | 3 | Fiber tracts: corpus callosum, internal capsule, fornix, etc. |
| `FV` | 10595 | ventricles of forebrain | ventricle | 4 | Lateral + third ventricles and their horns. |
| `FSS` | 10609 | surface structures of forebrain | grayMatter | 2 | Default for the surface branch, which is dominated by cortical gyri/lobules (the macroscopic gray-matter parcellation) plus surface gray nuclei (mammillary body, etc.). Overridden below for non-gray members. |
| `CeS` | 10610 | cerebral sulci | cerebrospinalFluid | 1 | Sulci & fissures (central sulcus, Sylvian/longitudinal/calcarine fissures…) are the infolded subarachnoid CSF spaces. Deeper than FSS, so it overrides. |
| `ASFV` | 146034908 | adjoining structures of forebrain ventricles | cerebrospinalFluid | 1 | Peri-ventricular CSF-adjacent structures. Overrides FSS. |
| `fbv` | 266441685 | blood vessels of forebrain | background | 0 | Vasculature is not one of the 9 tissue classes. |
| `FTS` | 10506 | transient structures of forebrain | background | 0 | Transient developmental structures (germinal zones, subplate) have no tissue class. Reviewable default. |
| `MGM` | 10649 | gray matter of midbrain | brainStem | 7 | Midbrain is part of the brain stem; `map.csv` has no separate midbrain GM/WM, so the whole midbrain collapses to brainStem. |
| `MWM` | 10650 | white matter of midbrain | brainStem | 7 | Same: midbrain fiber tracts are brain stem. |
| `MV` | 10651 | ventricle of midbrain | ventricle | 4 | Cerebral aqueduct. |
| `MSS` | 10652 | surface structures of midbrain | brainStem | 7 | Tectal surface, colliculi, nerve roots: brain stem surface. |
| `mbv` | 266441705 | blood vessels of midbrain | background | 0 | Vasculature. |
| `HGM` | 10654 | gray matter of the hindbrain | brainStem | 7 | Default for hindbrain gray = pons + medulla → brain stem. Cerebellum (deeper) overrides. |
| `HWM` | 10668 | white matter of hindbrain | cerebellarWhiteMatter | 6 | Catch-all hindbrain WM whose bulk voxels are the cerebellar internal white matter (arbor vitae): CB is filed under the gray branch (HGM), so its internal WM has no node and lands here. Spatial check: 100% of HWM voxels lie inside the cerebellum bbox, 85.5% closer to the cerebellum than the brainstem. Its peduncle children (icp, mcp) inherit 6. |
| `HV` | 10669 | ventricles of hindbrain | ventricle | 4 | Fourth ventricle, central canal. |
| `HSS` | 10670 | surface structures of hindbrain | brainStem | 7 | Default for hindbrain surface = pons/medulla surface. Cerebellar surface (deeper) overrides. |
| `hbv` | 266441737 | blood vessels of hindbrain | background | 0 | Vasculature. |
| `HTS` | 10663 | transient structures of hindbrain | background | 0 | Transient developmental structures. Reviewable default. |
| `CB` | 10656 | cerebellum | cerebellarGrayMatter | 5 | Cerebellar cortex + deep nuclei (both gray). Deeper than HGM → overrides brainStem. |
| `CbSS` | 12827 | surface structures of cerebellum | cerebellarGrayMatter | 5 | Cerebellar lobes/lobules (gray). Deeper than HSS → overrides brainStem. |
| `cbf` | 12828 | cerebellar fissures | cerebrospinalFluid | 1 | Cerebellar fissures = CSF. Deeper than CbSS → overrides cerebellar gray. |
| `scp` | 12354 | superior cerebellar peduncle | cerebellarWhiteMatter | 6 | Cerebellar WM tract filed under MWM (midbrain). Deeper than MWM → overrides brainStem. |
| `xscp` | 12337 | decussation of superior cerebellar peduncle | cerebellarWhiteMatter | 6 | Same; 0 voxels in this volume but anatomically cerebellar WM. (icp/mcp under HWM inherit 6, no anchor needed.) |
| `SpC` | 12890 | spinal cord | background | 0 | Not a brain tissue class. Reviewable default. |

Default (no anchor on path): background (0). This catches the abstract
container nodes above the GM/WM/V split — neural plate, neural tube, brain,
forebrain, midbrain, hindbrain — which never appear as voxel labels.
| Structure (acronym, path tail) | Anchors hit (leaf→root) | Result |
|---|---|---|
| precentral gyrus (PrCG) | FSS | gray (2) |
| putamen (Pu) | CN → FGM | deep gray (8) |
| thalamus (Pul) | THM → FGM | deep gray (8) |
| head of hippocampus (HiH) | FSS | gray (2) |
| longitudinal fissure (los) | CeS → FSS | CSF (1) |
| corpus callosum (cc) | FWM | white (3) |
| third ventricle (3V) | FV | ventricle (4) |
| cerebellar lobule (under CBLL) | CbSS → HSS | cerebellar gray (5) |
| cerebellar fissure (under cbf) | cbf → CbSS → HSS | CSF (1) |
| pons tegmentum (PnTg) | HGM (via Met/Pn) | brain stem (7) |
| red nucleus (RN, midbrain) | MGM/MWM | brain stem (7) |
| spinal cord leaf | SpC | background (0) |

- Midbrain + pons + medulla → a single `brainStem` (7). `map.csv` defines no separate brain-stem gray/white, and lists cerebellar gray/white as the only posterior-fossa subdivisions. So the entire midbrain and the non-cerebellar hindbrain collapse to brain stem, regardless of their internal gray/white organization. (Their ventricles still go to `ventricle` (4), since those branches are siblings of the GM/WM nodes, never nested under them.)
- Gyri are gray matter; sulci/fissures are CSF. The `FSS` "surface structures" branch is a macroscopic landmark parcellation containing both the cortical gyri (`CeG`, gray) and the sulci/fissures (`CeS`, the CSF-filled subarachnoid clefts). They are split accordingly. This is the one place where a naive "whole branch → one tissue" rule fails, and the deepest-wins override (`CeS` → CSF inside `FSS` → gray) handles it.
- Deep gray nuclei → `deepGrayMatter` (8). The subcortical/diencephalic nuclei (basal ganglia, amygdala, thalamus, hypothalamus, subthalamic nucleus) are siblings of cortex under `FGM`. The four branch anchors `CN` / `THM` / `SubTH` / `HTH` (deeper than `FGM`) pull them into class 8, leaving cortex in class 2. Scope is deliberately forebrain only: the midbrain deep nuclei (substantia nigra `SN`, red nucleus `RN`) stay `brainStem` (7) to keep the "midbrain = brain stem" rule intact, and the cerebellar deep nuclei (`CbDN`) stay `cerebellarGrayMatter` (5). The hippocampus stays `grayMatter` (2): the atlas files it under the surface branch `FSS` (allocortex), not under `CN`, so the deep-gray anchors never reach it.

The Allen annotation carries no CSF voxels: 106 sulci/fissure structures map
to class 1 in the ontology, but none of them are segmented in
annotation_full.nii.gz. Subarachnoid/sulcal CSF is therefore defined
spatially instead, in apply_classification.py, using the SynthStrip brain
mask:
every voxel that is inside the brain mask (`mask == 1`) but left unlabelled by the annotation (`classify == 0`) becomes CSF (1).

The mask was generated with SynthStrip:

    synthstrip -i mni_icbm152_t1_tal_nlin_sym_09b_hires.nii.gz \
               -m mni_icbm152_t1_tal_nlin_sym_09b_hires_synthstrip_mask.nii.gz

The filled voxels are exactly the CSF-filled clefts the parenchymal atlas does not cover. The fill adds 2,811,582 CSF voxels (only 1,060 annotation voxels fall outside the mask). After this step every class 1–8 is populated in the volume.

The atlas has no cerebellar-white-matter node under the cerebellum (CB
divides only into cortex CBC and deep nuclei CbDN, both gray; there is no
arbor-vitae / internal-WM node). The cerebellar white matter is instead recovered
from two places the ontology files outside CB:
- `HWM` (white matter of hindbrain): the bulk. This catch-all's voxels are the cerebellar internal white matter (arbor vitae), because `CB` is filed under the gray branch (`HGM`) and its internal WM has no dedicated node. A spatial check on `annotation_full.nii.gz` confirms it: 100% of `HWM` voxels lie inside the cerebellum bounding box, 85.5% are closer to the cerebellum centroid than to the brainstem, and the `HWM` centroid (z,y,x)=(69,152,196) nearly coincides with the cerebellum-gray centroid (71,142,196). So `HWM` → 6.
- The three cerebellar peduncles, filed under midbrain/hindbrain WM:

  | peduncle | acronym | id | voxels (csv) | filed under |
  |---|---|---|---|---|
  | superior cerebellar peduncle | `scp` | 12354 | 2,338 | `MWM` (midbrain WM) |
  | inferior cerebellar peduncle | `icp` | 12741 | 1,592 | `HWM` (hindbrain WM) |
  | middle cerebellar peduncle | `mcp` | 12768 | 10,905 | `HWM` (hindbrain WM) |
  | decussation of scp | `xscp` | 12337 | 0 | `MWM` |

  `scp` / `xscp` (under `MWM`) are explicit anchors → 6 (deeper than `MWM`, so deepest-wins overrides brainStem). `icp` / `mcp` simply inherit `HWM`'s 6.

Together these give 143,819 csv voxels / 287,566 in the volume for class 6 (cerebellar gray:white ≈ 3.9:1, anatomically reasonable).
Residual trade-off. HWM is one atomic label in the volume, so the ~14.5% of
its voxels near the pons (transitional pontocerebellar / MCP-entry fibers) also go
to class 6. This is acceptable — they are cerebellar-bound fibers — and far better
than the alternative, which would mislabel the 85.5% cerebellar core as brainstem.
The remaining class 7 (midbrain + pons + medulla gray + their named tracts,
139,773 csv voxels) is the true brainstem.
| value | tissue | structures (all 3,317) | structures (141 voxel-bearing) |
|---|---|---|---|
| 0 | background | 788 | 0 |
| 1 | cerebrospinalFluid | 106 | 0 |
| 2 | grayMatter | 1,198 | 57 |
| 3 | whiteMatter | 140 | 8 |
| 4 | ventricle | 23 | 9 |
| 5 | cerebellarGrayMatter | 86 | 4 |
| 6 | cerebellarWhiteMatter | 91 | 4 |
| 7 | brainStem | 530 | 12 |
| 8 | deepGrayMatter | 355 | 47 |
Only 141 of the 3,317 ontology nodes are annotated=True (carry voxels); the
rest are abstract intermediate nodes. Every one of the 141 voxel-bearing
structures resolves to a real tissue class (1–8) — none fall to background,
which is the primary correctness check. (Class 1 / CSF has 0 voxel-bearing
structures because its voxels come from the brain-mask fill, not the ontology.)
One command builds everything from scratch (download → brain mask → relabel → priors):

    ./generate_all.sh

It downloads the external inputs, runs SynthStrip for the brain mask (skipped if
the mask is already present; override with SYNTHSTRIP=<cmd>), then runs all
four stages below. The Python deps (SimpleITK + numpy, declared in
requirements.txt) are pulled in per stage by uv run --with-requirements, so
no virtual environment is managed by hand. Inputs and mask are reused if present;
the pipeline stages always re-run.
Requirements: curl, unzip, gzip, uv, and SynthStrip (only if the mask is
not already present).
To run the stages by hand instead:

    # Stage 0: download the external inputs (Allen ontology + volume, MNI T1)
    ./generate_all.sh        # (also runs stages 1–4; see above)
    # Stage 1: build the lookup tables (stdlib only)
    python3 classify_tissue.py
    # Stage 2: relabel the volume + spatial CSF fill (SimpleITK)
    uv run --with-requirements requirements.txt python apply_classification.py
    # Stage 3: probabilistic tissue priors for ANTs Atropos (SimpleITK)
    uv run --with-requirements requirements.txt python make_priors.py
    # Stage 4: coarser schemes (4-class, 3-class, cbmerge): volumes + priors
    uv run --with-requirements requirements.txt python make_coarse.py

`classify_tissue.py` (stage 1):

- Reads `voxel_count.csv` with `csv.DictReader` (the `name` column contains commas, so a real CSV parser is required).
- Resolves the 29 anchor acronyms to ids and builds the anchor→value table.
- Classifies every structure by the deepest-anchor-on-path rule.
- Writes the LUTs, numerically sorted by id: `tissue_map.lut` (all nodes), `tissue_map_annotated.lut` (voxel-bearing nodes), and `tissue_map_background.lut` (a visualization aid mapping background-class structures → 1, else 0).
- Prints an audit: per-class counts, and flags any voxel-bearing structure that fell to background (would indicate a misclassification to review).

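
The first two steps (CSV parsing and acronym resolution) might look roughly like the sketch below. It is an assumption-laden illustration rather than the repository's code: `SAMPLE_CSV` is a three-row excerpt with only three of the real columns, its last row is invented to show a comma inside `name`, and `EXAMPLE_ANCHORS`, `resolve_anchors` and the acronym `ZZZ` are hypothetical names.

```python
import csv
import io
import warnings

# Illustrative excerpt; the real voxel_count.csv has 3,317 rows and more columns.
# The last row is made up to show why a real CSV parser is needed.
SAMPLE_CSV = '''id,acronym,name
10157,FGM,gray matter of forebrain
10331,CN,cerebral nuclei
99999,XX,"made-up structure, with a comma"
'''

# Hypothetical anchor table keyed by acronym; "ZZZ" is deliberately unresolvable.
EXAMPLE_ANCHORS = {"FGM": 2, "CN": 8, "ZZZ": 3}


def resolve_anchors(rows, anchors):
    """Map anchor acronyms to structure ids, warning on any that are missing."""
    id_by_acronym = {row["acronym"]: int(row["id"]) for row in rows}
    resolved = {}
    for acronym, value in anchors.items():
        if acronym not in id_by_acronym:
            warnings.warn(f"unresolved anchor acronym: {acronym}")
            continue
        resolved[id_by_acronym[acronym]] = value
    return resolved


rows = list(csv.DictReader(io.StringIO(SAMPLE_CSV)))
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    anchor_values = resolve_anchors(rows, EXAMPLE_ANCHORS)

print(anchor_values)    # {10157: 2, 10331: 8}
print(rows[2]["name"])  # made-up structure, with a comma
print(len(caught))      # 1 warning, for "ZZZ"
```

Keying by acronym keeps the id lookup in one place, so an id renumbering in a later atlas release would only change what `resolve_anchors` returns, not the anchor table itself.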

`apply_classification.py` (stage 2):

- Loads `tissue_map.lut` (the all-nodes superset) as the single source of truth; the relabeling logic is not duplicated here.
- Reads `annotation_full.nii.gz` into a numpy array.
- Sparse vectorized remap. Ids reach ~2.7 × 10⁸, so a dense lookup array is avoided. Instead: `u = np.unique(arr)`, build the small `vals` array for just those ids, then `out = vals[np.searchsorted(u, arr)]`. Voxels with id 0 or any id absent from the LUT → background.
- Spatial CSF fill. Reads the SynthStrip brain mask (grid asserted to match) and sets `out[(mask == 1) & (out == 0)] = 1`, sourcing the CSF class the annotation does not carry (see "CSF (class 1)" above).
- Writes `classify.nii.gz` as `uint8`, calling `CopyInformation` to preserve spacing, origin and direction.
- Runs the verification below.

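
The remap and the CSF fill can be tried on a toy array. The sketch below follows the `np.unique` / `np.searchsorted` recipe described above, but the function names `remap_labels` and `fill_csf`, the 2×2 arrays, and the id `999` (standing in for an id absent from the LUT) are assumptions made for illustration; the two LUT entries are taken from the example LUT rows in this README.

```python
import numpy as np


def remap_labels(arr, lut):
    """Relabel structure ids to tissue values without a dense lookup array."""
    u = np.unique(arr)  # sorted unique ids actually present in the volume
    # One small value table covering just those ids; unknown ids -> 0.
    vals = np.array([lut.get(int(i), 0) for i in u], dtype=np.uint8)
    # Every element of arr is in u, so searchsorted returns its exact index.
    return vals[np.searchsorted(u, arr)]


def fill_csf(out, mask):
    """Set in-mask, still-unlabelled voxels to CSF (1); returns a copy."""
    filled = out.copy()
    filled[(mask == 1) & (filled == 0)] = 1
    return filled


lut = {10329: 2, 12369: 4}
arr = np.array([[0, 10329], [12369, 999]], dtype=np.uint32)
mask = np.array([[1, 1], [0, 1]], dtype=np.uint8)

out = remap_labels(arr, lut)
print(out.tolist())                  # [[0, 2], [4, 0]]
print(fill_csf(out, mask).tolist())  # [[1, 2], [4, 1]]
```

The lookup table has one entry per id that occurs in the volume, which is why the largest id value never affects memory use.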
`make_priors.py` converts the hard `classify.nii.gz` into soft spatial priors:

- For each tissue class 1–8, builds a binary mask and blurs it with a 2 mm FWHM Gaussian (`sigma = FWHM / (2·√(2·ln2)) = 0.8493 mm`). Uses `sitk.DiscreteGaussian` (a non-negative separable kernel, spacing-aware).
- Normalizes across classes per voxel: `prior_c = smoothed_c / Σ smoothed` wherever the total support exceeds `1e-4`; elsewhere all priors are 0. The result is a proper per-voxel probability distribution over the 8 classes.
- Writes `prior01.nii.gz … prior08.nii.gz` (float32, 0–1), `priorNN` ↔ class NN, preserving geometry via `CopyInformation`.

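
The FWHM-to-sigma conversion and the per-voxel normalization are easy to check in isolation. The sketch below uses plain numpy on a made-up 2-class, 3-voxel array instead of SimpleITK images, and the names `fwhm_to_sigma` and `normalize_priors` are hypothetical; it illustrates the arithmetic only and is not the code in `make_priors.py`.

```python
import math

import numpy as np


def fwhm_to_sigma(fwhm_mm):
    """sigma = FWHM / (2 * sqrt(2 * ln 2))."""
    return fwhm_mm / (2.0 * math.sqrt(2.0 * math.log(2.0)))


def normalize_priors(smoothed, eps=1e-4):
    """Normalize smoothed class masks, shape (n_classes, ...), per voxel.

    Voxels whose summed support is <= eps get all-zero priors.
    """
    total = smoothed.sum(axis=0)
    support = total > eps
    priors = np.zeros_like(smoothed, dtype=np.float32)
    priors[:, support] = smoothed[:, support] / total[support]
    return priors


sigma = fwhm_to_sigma(2.0)
print(round(sigma, 4))  # 0.8493
# Note: SimpleITK's DiscreteGaussian is parameterized by variance, so a
# caller would presumably pass sigma ** 2 rather than sigma.

# Toy data: 2 classes x 3 voxels; the last two voxels are below the threshold.
smoothed = np.array([[0.6, 0.0, 0.00002],
                     [0.2, 0.0, 0.00003]])
priors = normalize_priors(smoothed)
print(priors.tolist())              # [[0.75, 0.0, 0.0], [0.25, 0.0, 0.0]]
print(priors.sum(axis=0).tolist())  # [1.0, 0.0, 0.0]
```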
The figures below are all the same axial slice, rendered by make_figures.py
(run it after a build: uv run --with-requirements requirements.txt --with matplotlib python make_figures.py).
Stage 0 — inputs: the MNI T1 (the SynthStrip mask is derived from this) and the Allen annotation volume (each colour = a distinct structure id):
Stage 1 builds the text LUTs (tissue_map*.lut); stage 2 applies them to the
annotation and adds the mask-sourced CSF, giving the 9-class segmentation:
Stage 3 — the eight probabilistic priors (smoothed, normalized to 0–1). At this
slice prior06 (cerebellar WM) is nearly empty, as expected:
Stage 4 — the coarser schemes. cbmerge keeps brain stem (brown), cerebellum
(purple) and deep gray (pink) distinct; 4-class/3-class fold further:
| File | Format | Contents |
|---|---|---|
| `tissue_map.lut` | `id value`, space-separated, 3,317 rows | All ontology nodes → tissue value. |
| `tissue_map_annotated.lut` | `id value`, space-separated, 141 rows | Only the voxel-bearing structures. |
| `classify.nii.gz` | NIfTI, uint8, values 0–8 | The tissue segmentation (same grid as the input volume), incl. mask-sourced CSF. |
| `prior01.nii.gz … prior08.nii.gz` | NIfTI, float32, 0–1 | Per-class probabilistic priors (`priorNN` ↔ class NN), Atropos-ready. |

classify_tissue.py also writes tissue_map_background.lut (a visualization aid:
background-class structures → 1, else 0). The coarser schemes add
classify_<tag>.nii.gz and prior_<tag>_0N.nii.gz — see
Coarser schemes.
Example LUT rows:
10329 2
12369 4
12384 5
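
Reading a LUT back takes only a few lines. The helper below is a hypothetical reader (the name `parse_lut` is not from the repository) that assumes the two-column, whitespace-separated `id value` layout shown above and nothing else, such as headers or comments.

```python
def parse_lut(text):
    """Parse 'id value' lines into a {structure_id: tissue_value} dict."""
    lut = {}
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # tolerate blank lines
        structure_id, value = line.split()
        lut[int(structure_id)] = int(value)
    return lut


lut = parse_lut("10329 2\n12369 4\n12384 5\n")
print(lut)  # {10329: 2, 12369: 4, 12384: 5}
```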
Four independent checks, all passing:
- No voxel-bearing structure falls to background. All 141 `annotated=True` structures resolve to a tissue class 1–8.
- The output volume contains only valid tissue values. `np.unique(classify.nii.gz) == [0, 1, 2, 3, 4, 5, 6, 7, 8]`: every class 1–8 is populated (class 1 via the brain-mask fill, the rest from the annotation).
- Per-structure voxel-count cross-check against `voxel_count.csv`. Aggregating the csv `voxel_count` column by tissue value and comparing to the actual voxel counts in `classify.nii.gz`:

  | class | volume voxels | csv voxels | ratio |
  |---|---|---|---|
  | 2 grayMatter | 7144564 | 3572282 | 2.000 |
  | 3 whiteMatter | 4393397 | 2207402 | 1.990 |
  | 4 ventricle | 233697 | 120733 | 1.936 |
  | 5 cerebellarGrayMatter | 1115998 | 557999 | 2.000 |
  | 6 cerebellarWhiteMatter | 287566 | 143819 | 1.999 |
  | 7 brainStem | 275416 | 139773 | 1.970 |
  | 8 deepGrayMatter | 512772 | 257193 | 1.994 |

  Every leaf structure matches at exactly 2.000×: the volume is at twice the count-scale of `voxel_count.csv` (the csv counts are at half scale). The ratio being a single shared constant across all classes is the real proof of mapping consistency: a relabeling bug would skew classes by different amounts. The small dips below 2.0 for thin structures (ventricle 1.936) are half-scale partial-volume effects on the csv counts, not mapping errors.
- Priors are a valid probability distribution and recover the hard labels. Each `prior0N.nii.gz` lies in `[0, 1]`; the per-voxel sum is `1.000000` inside the support and `0` outside. Taking `argmax` over the 8 priors reproduces the hard `classify.nii.gz` label at 97.2% of supported labelled voxels and 99.98% of strong-prior interiors (>0.9); the ~2.8% gap is expected boundary softening from the 2 mm-FWHM Gaussian, not a labeling error.

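
The ratio column of the voxel-count cross-check can be recomputed from the counts quoted in the table. The snippet below only redoes that arithmetic; the dictionary name `COUNTS` and the 1.9 to 2.0 acceptance band are assumptions chosen for illustration, not thresholds taken from the repository.

```python
# (volume voxels, csv voxels) per class, copied from the table above.
COUNTS = {
    "grayMatter": (7144564, 3572282),
    "whiteMatter": (4393397, 2207402),
    "ventricle": (233697, 120733),
    "cerebellarGrayMatter": (1115998, 557999),
    "cerebellarWhiteMatter": (287566, 143819),
    "brainStem": (275416, 139773),
    "deepGrayMatter": (512772, 257193),
}

ratios = {name: vol / csv_n for name, (vol, csv_n) in COUNTS.items()}

for name, ratio in ratios.items():
    print(f"{name:<24s}{ratio:.3f}")

# Assumed sanity band: a mapping bug would be expected to push some class
# far from the shared factor of about 2.
all_near_two = all(1.9 < r <= 2.0 for r in ratios.values())
print(all_near_two)  # True
```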
make_coarse.py derives reduced label sets by collapsing the 8 tissue
classes of classify.nii.gz (no second ontology walk — the 9-class result is
the source of truth). It regenerates, for each scheme, the LUTs, the relabeled
volume, and the priors, reusing the identical 2 mm-FWHM DiscreteGaussian
smoothing + per-voxel normalization as make_priors.py.
How each 9-class label folds in (brain stem → white matter by project decision; ventricles → CSF; cerebellar GM/WM merge into GM/WM):
| 9-class source | → 4-class (CSF/GM/WM/DeepGM) | → 3-class (CSF/GM/WM) |
|---|---|---|
| 1 cerebrospinalFluid | 1 CSF | 1 CSF |
| 4 ventricle | 1 CSF | 1 CSF |
| 2 grayMatter | 2 GM | 2 GM |
| 5 cerebellarGrayMatter | 2 GM | 2 GM |
| 8 deepGrayMatter | 4 DeepGM | 2 GM |
| 3 whiteMatter | 3 WM | 3 WM |
| 6 cerebellarWhiteMatter | 3 WM | 3 WM |
| 7 brainStem | 3 WM | 3 WM |
A third scheme, cbmerge (7 classes), keeps the full 9-class granularity but
merges cerebellar gray (5) + cerebellar white (6) into a single cerebellum
label, renumbered contiguous: 1 CSF, 2 GM, 3 WM, 4 ventricle, 5 cerebellum,
6 brainStem, 7 deepGM. (All other classes unchanged; nothing else folds.)
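
Because every coarse scheme is a pure relabeling of the values 0–8, each fold can be written as a 9-entry lookup array indexed by the 9-class value. The sketch below encodes the table and the `cbmerge` renumbering described above; the array names and the `fold` helper are assumptions, and `make_coarse.py` may well implement the collapse differently.

```python
import numpy as np

# Index = 9-class value (0..8); entry = value in the coarse scheme.
# 9-class order: 0 bg, 1 CSF, 2 GM, 3 WM, 4 ventricle,
#                5 cerebellar GM, 6 cerebellar WM, 7 brain stem, 8 deep GM.
FOLD_4CLASS = np.array([0, 1, 2, 3, 1, 2, 3, 3, 4], dtype=np.uint8)
FOLD_3CLASS = np.array([0, 1, 2, 3, 1, 2, 3, 3, 2], dtype=np.uint8)
FOLD_CBMERGE = np.array([0, 1, 2, 3, 4, 5, 5, 6, 7], dtype=np.uint8)


def fold(labels, table):
    """Relabel a 9-class label array through a lookup table."""
    return table[labels]


# Toy "volume" holding ventricle, cerebellar WM, brain stem and deep GM.
labels = np.array([[4, 6], [7, 8]], dtype=np.uint8)

print(fold(labels, FOLD_4CLASS).tolist())   # [[1, 3], [3, 4]]
print(fold(labels, FOLD_3CLASS).tolist())   # [[1, 3], [3, 2]]
print(fold(labels, FOLD_CBMERGE).tolist())  # [[4, 5], [6, 7]]
```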
All schemes verify the same way as the 9-class priors: labels in range,
per-voxel prior sum 1.000000 in support / 0 outside → CONSISTENT. (For the
4-/3-class schemes, CSF and WM voxel coverage is identical; only GM vs DeepGM
differs, via per-scheme renormalization.)
Outputs per scheme <tag> ∈ {4class, 3class, cbmerge}: map_<tag>.csv,
tissue_map_<tag>.lut, tissue_map_<tag>_annotated.lut, classify_<tag>.nii.gz,
and prior_<tag>_0N.nii.gz (Atropos-ready as -p prior_<tag>_%02d.nii.gz).
- The entire mapping is defined by the `ANCHORS` dict at the top of `classify_tissue.py`. To change a decision, edit one line and rerun the pipeline stages.
- Reviewable defaults, called out explicitly so they are easy to revisit:
  - transient developmental structures (`FTS`, `HTS`) → background (0)
  - spinal cord (`SpC`) → background (0)
  - blood vessels (`fbv`, `mbv`, `hbv`) → background (0)
  - sulci/fissures (`CeS`, `cbf`) → CSF (1)
  - bare container nodes (neural plate/tube, brain, forebrain/mid/hindbrain) → background (0)
- Because anchors are keyed by acronym and resolved at runtime (with a warning on any unresolved acronym), the pipeline is robust to id renumbering across atlas versions.

voxel_count.csv input — Allen ontology (3,317 structures)
map.csv input — 9 tissue classes
annotation_full.nii.gz input — label volume (Allen ids per voxel)
mni_icbm152_..._synthstrip_mask.nii.gz input — brain mask (CSF source)
generate_all.sh full build: download inputs + mask + run pipeline
requirements.txt python deps (SimpleITK, numpy) for `uv run`
classify_tissue.py builds the LUTs from the ontology hierarchy
apply_classification.py relabels the volume + mask CSF fill via SimpleITK
make_priors.py builds the 8 probabilistic priors via SimpleITK
make_coarse.py derives the 4-class/3-class/cbmerge schemes (LUT/vol/priors)
make_figures.py renders the images/ example-slice figures (matplotlib)
images/ example-slice PNGs embedded in this README
tissue_map.lut output — all 3,317 nodes → tissue value
tissue_map_annotated.lut output — 141 voxel-bearing → tissue value
tissue_map_background.lut output — viz aid: background-class structures → 1
classify.nii.gz output — 9-class tissue segmentation (uint8)
prior01..08.nii.gz output — per-class probabilistic priors (float32)
map_<tag>.csv output — coarse scheme defs (tag: 4class/3class/cbmerge)
tissue_map_<tag>[_annotated].lut output — coarse id → tissue LUTs
classify_<tag>.nii.gz output — coarse tissue segmentations (uint8)
prior_<tag>_0N.nii.gz output — coarse per-class priors (float32)



