Commit a3e6dd7

Author: Rachel Shekar
Create 2016_08_01_genomics_standards.md
1 parent 4961751 commit a3e6dd7

1 file changed

Lines changed: 93 additions & 0 deletions

@@ -0,0 +1,93 @@
---
layout: post
title: "August 2016 Genomic Standards Committee Meeting Notes"
modified:
categories: blog
excerpt:
tags: []
image:
feature:
date: 2016-08-01T20:43:38-05:00
---

# TERRA-REF Genomics Standards Committee Meeting

## **Participants**

David LeBauer, Christine Laney, Michael Gore, Carolyn Lawrence-Dill, Eric Lyons, Noah Fahlgren

REGRETS:
Todd Mockler, Max Burnette, David Lee, Geoff Morris, Craig Willis

## **Agenda**

Introductions

Objective: review the current status of the pipeline and plans for the first data release in November.

Overview (Noah)

Sequencing

- what has been done
- 192 resequenced genomes (~20-30x coverage each) from Steve K.'s bioenergy association panel (BAP)
- 192 additional samples sent to HudsonAlpha one week ago (20-30x)
- External funding
- Illumina for additional ~1000 sequences
- DOE CSP for de novo
- Data quality control and analysis to date done on the Danforth Center cluster
- Trimmomatic => bwa => GATK => CNVnator
- By November: users will upload raw sequencing data and metadata to the TERRA-REF pipeline using CoGe (below)
- what is in pipeline
- Raw data and experimental metadata added to Clowder
- Clowder extractor
- Upload data to the CyVerse data store (TERRA-REF)
- Launch CoGe workflow using the API
- Synchronize results back to Clowder/BETYdb
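
The three extractor steps listed above (upload to the data store, launch the workflow, synchronize results back) can be sketched as a small orchestration function. This is only an illustration of the ordering, not the project's implementation: `run_extractor`, `ExtractorRun`, and the three callables are hypothetical names, and the stubs stand in for whatever the real CyVerse, CoGe, and Clowder/BETYdb calls turn out to be.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class ExtractorRun:
    """Record of one pass through the three steps, in the order they ran."""
    dataset_id: str
    steps: List[str] = field(default_factory=list)
    results: Dict[str, str] = field(default_factory=dict)


def run_extractor(
    dataset_id: str,
    upload_to_data_store: Callable[[str], str],
    launch_workflow: Callable[[str], str],
    sync_results: Callable[[str, str], None],
) -> ExtractorRun:
    """Run upload -> launch -> sync; each callable is a hypothetical stand-in."""
    run = ExtractorRun(dataset_id)

    # Step 1: copy the raw data to the data store and keep the returned path.
    remote_path = upload_to_data_store(dataset_id)
    run.steps.append("upload")
    run.results["remote_path"] = remote_path

    # Step 2: start the analysis workflow on the uploaded data.
    job_id = launch_workflow(remote_path)
    run.steps.append("launch")
    run.results["job_id"] = job_id

    # Step 3: write the workflow outputs back to the source records.
    sync_results(dataset_id, job_id)
    run.steps.append("sync")
    return run


# Stub implementations so the sketch runs without any network access.
synced = []
demo = run_extractor(
    "sample-001",
    upload_to_data_store=lambda ds: f"/store/{ds}.fastq",
    launch_workflow=lambda path: f"job-for-{path}",
    sync_results=lambda ds, job: synced.append((ds, job)),
)
print(demo.steps)
```

Passing the three steps in as callables keeps the ordering logic testable on its own; the real service calls could be swapped in later without changing the control flow.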

- Clowder: a database that can hold data in any format. Importing data into Clowder automatically triggers an extractor that moves the data to the correct location for discovery and analysis
- Data will be uploaded to the NCBI SRA
- Can we link from the SRA to CyVerse and Clowder easily and robustly?
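
The quality-control chain noted earlier in this section (read trimming, alignment, variant calling, copy-number detection) has an implied ordering: each tool needs an input that an earlier tool produces. A minimal sketch of checking that ordering follows. The tool order comes from the notes; the artifact labels (`raw_reads`, `alignments`, and so on) and the `Stage` and `check_order` names are assumptions made for illustration, not how the Danforth Center cluster jobs are actually wired.

```python
from typing import List, NamedTuple, Set


class Stage(NamedTuple):
    tool: str
    consumes: str   # artifact the stage reads (assumed label)
    produces: str   # artifact the stage writes (assumed label)


# Tool order is from the notes; the artifact labels are illustrative assumptions.
QC_CHAIN: List[Stage] = [
    Stage("Trimmomatic", "raw_reads", "trimmed_reads"),
    Stage("bwa", "trimmed_reads", "alignments"),
    Stage("GATK", "alignments", "variants"),
    Stage("CNVnator", "alignments", "copy_number_calls"),
]


def check_order(chain: List[Stage], initial: str = "raw_reads") -> List[str]:
    """Return the tools whose input is not yet available when they are reached."""
    available: Set[str] = {initial}
    unmet: List[str] = []
    for stage in chain:
        if stage.consumes not in available:
            unmet.append(stage.tool)
        # Record the output either way so later stages are checked independently.
        available.add(stage.produces)
    return unmet


print(check_order(QC_CHAIN))
print(check_order(list(reversed(QC_CHAIN))))
```

In this sketch the chain as listed has no unmet inputs, while the reversed chain reports every tool that would run before its input exists.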

CoGe pipeline

- A sample analysis: [https://genomevolution.org/coge/NotebookView.pl?nid=1344](https://genomevolution.org/coge/NotebookView.pl?nid=1344)
- Draft implementation: [https://github.com/terraref/computing-pipeline/blob/f94a87f851b37ff74ded5b7b6b3b0c1e13107720/scripts/coge/coge\_upload.json](https://github.com/terraref/computing-pipeline/blob/f94a87f851b37ff74ded5b7b6b3b0c1e13107720/scripts/coge/coge_upload.json)

Downstream Analyses

- GOBII
- Other downstream tools?
- SNP calling via CoGe
- What is already within CoGe
- Putting proprietary GATK on CyVerse (Mike G will send more info)

Data Sharing

- when, where, and with what will we share as of November
- Currently using CyVerse data store ([https://de.iplantcollaborative.org/de/](https://de.iplantcollaborative.org/de/))
- [terraref/reference-data/19](https://github.com/terraref/reference-data/issues/19)
- Phytozome (a DOE database): is this appropriate for our data? Perhaps not for raw reads (Mike G)
- Maybe we can submit variation information from the CoGe pipeline and update it as the reference genome is updated
- Is Phytozome interested in hosting pangenome resources?
- NCBI SRA: raw data + experimental metadata
- NEON has worked with SRA on data/metadata sharing; keep in touch with them
- Others?

Other questions / ideas

- How to get from GenBank to related

NEON: providing metagenomic data, processed and made available to the public with MG-RAST; marker gene sequences will be hosted in SRA, so they are not available within the NEON portal but are available from the external repository. Genomic standards meeting next week; working on an environmental soil metadata package for MIxS [http://gensc.org/mixs/submit-mixs-metadata/](http://gensc.org/mixs/submit-mixs-metadata/)

NEON has started using EML to document sensor and observational data (currently online at [http://data.neonscience.org](http://data.neonscience.org) but not pretty). May begin doing this with soil samples.

Action items:

### **References**

- Genomics pipeline documentation: [https://github.com/terraref/documentation/blob/master/genomics\_pipeline.md](https://github.com/terraref/documentation/blob/master/genomics_pipeline.md)
- Genomics data formats: [terraref/reference-data/19](https://github.com/terraref/reference-data/issues/19)
- Pipeline implementation: [terraref/computing-pipeline/issues/37](https://github.com/terraref/computing-pipeline/issues/37)
- Using CoGe: [terraref/computing-pipeline/issues/41](https://github.com/terraref/computing-pipeline/issues/41)
