ot710/smoking_striatum_iron

Smoking and Striatal Iron: Phenotypic and Genetic Analyses

This repository contains the code used to generate the results reported in "Bidirectional genetic and phenotypic links between smoking and striatal iron content involving dopaminergic and inflammatory pathways" (Addiction, 2026). The study investigates phenotypic, genetic, and causal relationships between tobacco smoking and MRI-derived markers of iron content in the striatum using UK Biobank data.

Project Overview

The aim of this project is to examine the relationship between smoking behaviour (smoking initiation, smoking status, pack-years, and years since cessation) and brain iron accumulation in striatal regions implicated in reward processing (putamen, caudate, and nucleus accumbens). To this end, the project integrates:

  • Phenotypic association analyses using individual-level UK Biobank data
  • Genome-wide association study (GWAS) summary statistics
  • Genetic correlation analyses
  • Cross-GWAS gene-wise coherence and causal inference analyses
  • Mendelian randomisation (MR)

The analyses combine MRI-derived markers of iron content (T2* and quantitative susceptibility mapping, QSM) with smoking-related phenotypes to investigate shared genetic architecture and potential bidirectional causal mechanisms.


Repository Structure

The repository is organised into the following main directories:

.
├── gwas/
├── phenotypic/
├── ldsc/
├── pascalx/
└── MR/

gwas/ — GWAS Summary Statistics

Download and format GWAS summary statistics used across downstream genetic analyses.

  • Download GWAS summary statistics from public repositories
    (download_gwas_stats.sh)
  • Merge UK Biobank brain GWAS summary statistics from discovery and replication samples using an inverse-variance weighted estimator
    (gwas_qsm_stats_merge.R)
  • Generate analysis-ready GWAS files compatible with LD Score Regression
    (gwas_stats_format_for_ldsc.py)
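
The inverse-variance weighted merge of discovery and replication estimates mentioned above can be sketched for a single variant as follows. This is a minimal illustration of a fixed-effect inverse-variance weighted estimator, not the code in gwas_qsm_stats_merge.R (which is written in R). The function name `ivw_merge` and its arguments are hypothetical, and the sketch assumes both effect sizes refer to the same effect allele.

```python
import math


def ivw_merge(beta_disc, se_disc, beta_rep, se_rep):
    """Combine two effect estimates for one variant (hypothetical helper).

    Assumes both betas are aligned to the same effect allele.
    Returns the pooled beta, its standard error, the z-score and a
    two-sided p-value under a normal approximation.
    """
    # Each sample is weighted by the inverse of its sampling variance.
    w_disc = 1.0 / se_disc ** 2
    w_rep = 1.0 / se_rep ** 2

    beta = (w_disc * beta_disc + w_rep * beta_rep) / (w_disc + w_rep)
    se = math.sqrt(1.0 / (w_disc + w_rep))
    z = beta / se

    # Two-sided p-value: 2 * (1 - Phi(|z|)) = erfc(|z| / sqrt(2)).
    p = math.erfc(abs(z) / math.sqrt(2.0))
    return beta, se, z, p


# Illustrative numbers only, not taken from any real GWAS.
beta, se, z, p = ivw_merge(0.10, 0.02, 0.06, 0.04)
print(f"beta={beta:.4f} se={se:.4f} z={z:.2f} p={p:.2e}")
```

The more precise sample (smaller standard error) dominates the pooled estimate, and the pooled standard error is always smaller than either input.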

phenotypic/ — Phenotypic Analyses

Phenotypic association analyses between smoking-related measures and striatal iron markers using individual-level UK Biobank data.

  • Creation of a custom dataset containing the variables of interest
    (dataset_creation.py)
  • Data preprocessing
    (data_preprocessing.py)
  • Linear regression models
    (linear_models.py)
  • Correlation analyses
    (correlations.py)
  • Visualisation and plotting
    (plot_phen_gen_corr.py)
  • Robustness metrics for sensitivity analyses
    (robustness_metrics.py)

ldsc/ — Genetic Correlation (LD Score Regression)

Genetic correlations between smoking phenotypes and striatal iron traits using Linkage Disequilibrium Score Regression.

  • Cross-trait genetic correlation analyses
    (ldsc_gcorr.sh)
  • False discovery rate correction
    (ldsc_fdr.py)
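
A false discovery rate correction over a set of genetic correlation p-values can be sketched as below. The README does not state which procedure ldsc_fdr.py applies, so this sketch assumes the Benjamini-Hochberg step-up procedure. The function name `benjamini_hochberg` and the example p-values are hypothetical.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the input order.

    Assumption: BH is used here for illustration only; the procedure
    used in ldsc_fdr.py may differ.
    """
    m = len(p_values)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])

    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


# Illustrative p-values only, one per hypothetical trait pair.
p_raw = [0.01, 0.04, 0.03, 0.005]
p_fdr = benjamini_hochberg(p_raw)
significant = [q < 0.05 for q in p_fdr]
```

Adjusted values are returned in the same order as the inputs, so they can be written back next to the corresponding trait pairs.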

pascalx/ — Cross-GWAS Gene-wise Coherence and Causality

Cross-GWAS gene-wise coherence testing and causal relationship analyses using PascalX.

  • Cross-trait coherence test
    (xscorer.py)
  • Cross-trait ratio test (causal analysis)
    (xscorer_ratio.py)
  • Short script that creates the anti-coherence result file directly from the coherence file ($p_{anticoherence} = 1 - p_{coherence}$), which saves time
    (make_anticohe_from_cohe.py)
  • Visualisation and plotting
    (gene_tables_heatmaps.py)

Supporting files:

  • xscorer_config.py: PascalX configuration for xscorer.py and xscorer_ratio.py
  • cluster_loc.csv: names and locations of genes that are part of a cluster
  • confounder_genes.csv: genes previously associated with possible confounders (weekly alcohol consumption and serum iron)
  • genes2remove.csv: non-coding genes and duplicates with different names
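
The relation $p_{anticoherence} = 1 - p_{coherence}$ used by make_anticohe_from_cohe.py can be sketched as follows. This is an illustration of the arithmetic only. The row layout, the column name `pvalue`, and the gene labels are hypothetical and do not reflect the actual PascalX output format.

```python
def anticoherence_rows(rows, p_column="pvalue"):
    """Derive anti-coherence rows from coherence rows (hypothetical helper).

    Each row is a dict; only the p-value column is changed, using
    p_anticoherence = 1 - p_coherence as stated in this README.
    """
    converted = []
    for row in rows:
        new_row = dict(row)  # copy so the coherence rows stay untouched
        new_row[p_column] = 1.0 - float(row[p_column])
        converted.append(new_row)
    return converted


# Placeholder gene names and p-values, for illustration only.
coherence = [
    {"gene": "GENE_A", "pvalue": "0.02"},
    {"gene": "GENE_B", "pvalue": "0.75"},
]
anticoherence = anticoherence_rows(coherence)
```

Rows such as these could be read with `csv.DictReader` and written with `csv.DictWriter` from the standard library.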

MR/ — Mendelian Randomisation

Mendelian randomisation analyses assessing potential causal relationships between smoking and striatal iron measures.

  • MR forward
    (MR_forward.R)
  • MR reverse
    (MR_reverse.R)
  • Sensitivity analysis
    (MR_sensitivity.R)
  • Visualisation and plotting
    (MR_figure.py)

Supporting files:

  • MR_functions.R: Functions for MR and sensitivity analysis
  • config_forward.R: Configuration for MR forward
  • config_reverse.R: Configuration for MR reverse
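
For readers unfamiliar with MR, the basic inverse-variance weighted (IVW) estimate can be sketched as below. This is a fixed-effect illustration in Python and is not the implementation used in the R scripts or in the TwoSampleMR package, which also offers other estimators and different standard error handling. The function name `ivw_mr` and all input values are hypothetical, and the sketch assumes the variant effects are already harmonised to the same effect allele.

```python
import math


def ivw_mr(beta_exposure, beta_outcome, se_outcome):
    """Fixed-effect IVW causal estimate from per-variant summary statistics.

    Equivalent to a weighted average of Wald ratios
    (beta_outcome / beta_exposure) with weights
    beta_exposure**2 / se_outcome**2.
    """
    numerator = 0.0
    denominator = 0.0
    for bx, by, sy in zip(beta_exposure, beta_outcome, se_outcome):
        numerator += bx * by / sy ** 2
        denominator += bx ** 2 / sy ** 2

    estimate = numerator / denominator
    se = math.sqrt(1.0 / denominator)
    return estimate, se


# Illustrative values only, not real instruments.
bx = [0.10, 0.20]    # variant effects on the exposure
by = [0.05, 0.10]    # variant effects on the outcome
sy = [0.01, 0.02]    # standard errors of the outcome effects
estimate, se = ivw_mr(bx, by, sy)
```

Swapping the exposure and outcome inputs gives the estimate for the opposite causal direction, which is the idea behind running both a forward and a reverse analysis.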

Data Availability

  • UK Biobank data: Access requires an approved UK Biobank application. Individual-level data are not included in this repository.
  • GWAS summary statistics: Publicly available sources were used (see scripts and comments in gwas/ for details).

Reproducibility Notes

Analyses were conducted using a combination of Python, R, and external genetic analysis tools.

Core Python and R package requirements are listed in requirements.txt and include:

  • Python 3.7 with standard scientific computing libraries
  • R 4.2.2 with the TwoSampleMR package (v0.5.7)

The following external tools were used and must be installed separately:

  • LD Score Regression v1.0.1 (Python 2.7)
  • PascalX v0.0.3 (Python 3.8.19)

Due to data access restrictions, full reproduction of results requires authorised access to UK Biobank data.


Contact

For questions regarding the code or analyses, please contact the corresponding author of the paper.
