
Commit 83ff1aa: added finding missing section function

1 parent 3370bfa
2 files changed: 134 additions & 37 deletions

README.md: 102 additions & 37 deletions
@@ -1,46 +1,111 @@
-# xenium_analysis_tools
+# Xenium Analysis Tools
 
-[![License](https://img.shields.io/badge/license-MIT-brightgreen)](LICENSE)
-![Code Style](https://img.shields.io/badge/code%20style-black-black)
-[![semantic-release: angular](https://img.shields.io/badge/semantic--release-angular-e10079?logo=semantic-release)](https://github.com/semantic-release/semantic-release)
-![Interrogate](https://img.shields.io/badge/interrogate-100.0%25-brightgreen)
-![Coverage](https://img.shields.io/badge/coverage-100%25-brightgreen)
-![Python](https://img.shields.io/badge/python->=3.10-blue?logo=python)
-
-## Usage
-- To use this template, click the green `Use this template` button and `Create new repository`.
-- After github initially creates the new repository, please wait an extra minute for the initialization scripts to finish organizing the repo.
-- To enable the automatic semantic version increments: in the repository go to `Settings` and `Collaborators and teams`. Click the green `Add people` button. Add `svc-aindscicomp` as an admin. Modify the file in `.github/workflows/tag_and_publish.yml` and remove the if statement in line 65. The semantic version will now be incremented every time a code is committed into the main branch.
-- To publish to PyPI, enable semantic versioning and uncomment the publish block in `.github/workflows/tag_and_publish.yml`. The code will now be published to PyPI every time the code is committed into the main branch.
-- The `.github/workflows/test_and_lint.yml` file will run automated tests and style checks every time a Pull Request is opened. If the checks are undesired, the `test_and_lint.yml` can be deleted. The strictness of the code coverage level, etc., can be modified by altering the configurations in the `pyproject.toml` file and the `.flake8` file.
-- Please make any necessary updates to the README.md and CITATION.cff files
-
-## Level of Support
-Please indicate a level of support:
-- [ ] Supported: We are releasing this code to the public as a tool we expect others to use. Issues are welcomed, and we expect to address them promptly; pull requests will be vetted by our staff before inclusion.
-- [ ] Occasional updates: We are planning on occasional updating this tool with no fixed schedule. Community involvement is encouraged through both issues and pull requests.
-- [ ] Unsupported: We are not currently supporting this code, but simply releasing it to the community AS IS but are not able to provide any guarantees of support. The community is welcome to submit issues, but you should not expect an active response.
-
-## Release Status
-GitHub's tags and Release features can be used to indicate a Release status.
-
-- Stable: v1.0.0 and above. Ready for production.
-- Beta: v0.x.x or indicated in the tag. Ready for beta testers and early adopters.
-- Alpha: v0.x.x or indicated in the tag. Still in early development.
+A Python library for processing and mapping Xenium spatial data, developed by the Allen Institute for Neural Dynamics.
 
 ## Installation
-To use the software, in the root directory, run
-```bash
-pip install -e .
+
+### Code Ocean Package Manager (Recommended)
+This library can be installed directly via the Code Ocean environment manager.
+
+1. Open your Capsule.
+2. Go to the **Environment** tab.
+3. In the **Pip** section, click **Add**.
+4. Paste the following link:
+```text
+git+https://github.com/AllenInstitute/xenium_analysis_tools#egg=xenium-analysis-tools
+
 ```
 
-To develop the code, run
+5. Click **Launch Cloud Workstation** to build.
+
+### Local Installation
+
+To install locally or in a standard terminal:
+
 ```bash
-pip install -e . --group dev
+pip install git+https://github.com/AllenInstitute/xenium_analysis_tools.git
+
 ```
-Note: --group flag is available only in pip versions >=25.1
 
-Alternatively, if using `uv`, run
-```bash
-uv sync
+---
+
+## Modules
+
+The library is organized into three primary sub-packages designed to handle different stages of the Xenium analysis pipeline.
+
+### 1. `process_xenium`
+
+Tools for processing raw Xenium outputs, managing SpatialData objects, and preparing data for downstream analysis.
+
+* **`process_dataset_slides`**: Main workflow for processing slides across an entire dataset.
+* **`process_spatialdata`**: Core logic for manipulating and formatting Xenium `SpatialData` objects.
+* **`divide_sections`**: Utilities for handling section boundaries and splitting data.
+* **`validate_sections`**: Quality control checks to ensure section integrity before processing.
+* **`generate_dataset_slides`**: Helper functions for creating slide-level representations.
+
+### 2. `map_xenium`
+
+Functions for mapping cell types to Xenium data using reference taxonomies.
+
+* **`map_sections`**: Logic for mapping cell types on individual tissue sections.
+* **`map_dataset_sections`**: Batch processing tools to apply mapping across multiple sections in a dataset.
+
+### 3. `utils`
+
+Shared utility functions used across the library.
+
+* **`io_utils`**: Standardized functions for loading and saving Xenium data structures.
+
+---
+
+## Usage
+
+Import the specific modules you need for your analysis workflow.
+
+**Example: Processing a Dataset**
+
+```python
+from xenium_analysis_tools.process_xenium import process_dataset_slides
+from xenium_analysis_tools.utils import io_utils
+
+# Load your configuration or data path
+data_path = "/path/to/xenium/data"
+
+# Run the processing pipeline
+process_dataset_slides.run(data_path)
+
+```
+
+**Example: Mapping Sections**
+
+```python
+from xenium_analysis_tools.map_xenium import map_dataset_sections
+
+# Run cell type mapping on processed sections
+map_dataset_sections.run_mapping(
+    processed_data_path="/path/to/processed/data",
+    taxonomy_ref="/path/to/taxonomy"
+)
+
 ```
+
+---
+
+## Development
+
+### Updating the Package
+
+1. Make changes to the code in the `src/` directory.
+2. Bump the version in `src/xenium_analysis_tools/__init__.py`.
+3. Commit and push to GitHub.
+4. Create and push a new tag matching the version (e.g., `v0.1.1`).
+
+### Running Tests
+
+This project uses `pytest`. Run the following in the root directory:
+
+```bash
+pytest tests/
+
+```
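The "Updating the Package" steps in the new Development section describe the release flow without showing commands. A minimal sketch of what they might look like in a terminal (the version number, branch name, and commit message below are illustrative placeholders, not part of this commit):

```bash
# Steps 1-2: edit code under src/ and bump __version__ in
# src/xenium_analysis_tools/__init__.py, then:

# Step 3: commit and push the change
git add src/xenium_analysis_tools/__init__.py
git commit -m "Bump version to 0.1.1"
git push origin main

# Step 4: create and push a tag matching the new version
git tag v0.1.1
git push origin v0.1.1
```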

src/xenium_analysis_tools/process_xenium/generate_dataset_slides.py: 32 additions & 0 deletions
@@ -15,6 +15,28 @@
 )
 from xenium_analysis_tools.process_xenium.process_spatialdata import read_xenium_slide
 
+def find_xenium_bundle(bundle_name, data_folder='/root/capsule/data'):
+    data_folder = Path(data_folder)
+    search_paths = [
+        data_folder / 'xenium_data',
+        data_folder / 'Xenium_output_pilot'
+    ]
+    search_paths = [path for path in search_paths if path.exists()]
+    all_dirs = np.concatenate([list(folder.iterdir()) for folder in search_paths])
+    output_folders = np.concatenate([list(folder.glob('output-*')) for folder in search_paths])
+    subfolders = np.setdiff1d(all_dirs, output_folders)
+    path_to_bundle = None
+    found_dirs = [dir for dir in output_folders if dir.name == bundle_name]
+    if found_dirs:
+        path_to_bundle = found_dirs[0]
+    else:
+        for sub in subfolders:
+            found_dirs = [dir for dir in list(sub.iterdir()) if dir.name == bundle_name]
+            if found_dirs:
+                path_to_bundle = found_dirs[0]
+                break
+    return path_to_bundle
+
 def generate_slides(dataset_name: str, config_path: str=None, select_sections: list[int]|None = None):
     """
     Generate slide-level SpatialData objects from raw Xenium data bundles.
@@ -55,6 +77,16 @@ def generate_slides(dataset_name: str, config_path: str=None, select_sections: list[int]|None = None):
             logger.info(f"Slide {slide_id} already processed. Skipping.")
             continue
         logger.info(f"Generating SpatialData object for slide {slide_id}...")
+        if not (raw_slide_path / 'experiment.xenium').exists():
+            logger.info(f"Experiment file not found for slide {slide_id} at {raw_slide_path / 'experiment.xenium'}")
+            logger.info(f"Looking for alternative experiment file...")
+            path_to_bundle = find_xenium_bundle(slide_row['dir'], data_folder=paths['data_root'])
+            if path_to_bundle is not None:
+                logger.info(f"Found alternative experiment file in {path_to_bundle.parent}")
+                raw_slide_path = path_to_bundle
+            else:
+                logger.error(f"Could not find experiment file for slide {slide_id}. Skipping.")
+                continue
         logger.info(f"Reading Xenium bundle: {raw_slide_path}")
         sdata_reader_params = config.get('sdata_reader_params', {})
         if sdata_reader_params.get('n_jobs') == "max": sdata_reader_params['n_jobs'] = os.cpu_count()
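The new `find_xenium_bundle` helper searches the `output-*` folders directly under the known data roots first, then falls back to looking one level deeper inside the remaining subfolders. The committed version relies on `numpy` (`np`) and `pathlib.Path`, presumably imported earlier in the module (not shown in this hunk). A standalone sketch of the same two-level lookup using only `pathlib` (hypothetical function name, for illustration only):

```python
from pathlib import Path

def find_bundle_sketch(bundle_name: str, data_folder: str = "/root/capsule/data") -> Path | None:
    """Locate a Xenium bundle by name: first among output-* folders directly under
    the known data roots, then one level deeper inside the remaining subfolders."""
    roots = [Path(data_folder) / "xenium_data", Path(data_folder) / "Xenium_output_pilot"]
    roots = [root for root in roots if root.exists()]

    # Pass 1: output-* bundles sitting directly under a data root
    output_folders = [p for root in roots for p in root.glob("output-*")]
    for candidate in output_folders:
        if candidate.name == bundle_name:
            return candidate

    # Pass 2: look one level deeper inside the non-output subfolders
    for root in roots:
        for sub in root.iterdir():
            if sub in output_folders or not sub.is_dir():
                continue
            for child in sub.iterdir():
                if child.name == bundle_name:
                    return child
    return None
```

In the second hunk above, `generate_slides` calls the real helper as a fallback when `experiment.xenium` is missing from the expected slide path, and skips the slide if no bundle is found.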
