parser=argparse.ArgumentParser(description="This script handles downloading, processing and formatting of sample files for the Novartis PDX data into a single samplesheet")
parser.add_argument("-p", '--prevSamples', nargs="?", type=str, default="", const="", help="Use this to provide previous sample file, will run sample file generation")
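For readers less familiar with `nargs="?"`, the sketch below isolates the `--prevSamples` option exactly as declared above and records what argparse returns in three situations. Only this one option is reproduced; the `build_parser` helper, the description string, and the file name `prev_samples.csv` are hypothetical and exist only for illustration.

```python
import argparse


def build_parser():
    # Hypothetical helper: rebuilds only the --prevSamples option from the diff.
    parser = argparse.ArgumentParser(description="nargs='?' demonstration")
    parser.add_argument(
        "-p", "--prevSamples",
        nargs="?", type=str, default="", const="",
        help="Use this to provide previous sample file, will run sample file generation",
    )
    return parser


parser = build_parser()

# Option omitted entirely: argparse falls back to `default`.
absent = parser.parse_args([]).prevSamples

# Option given without a value: argparse uses `const`.
bare = parser.parse_args(["-p"]).prevSamples

# Option given with a value: argparse stores that value.
given = parser.parse_args(["-p", "prev_samples.csv"]).prevSamples

print(repr(absent), repr(bare), repr(given))
```

Because `default` and `const` are both the empty string here, omitting the option and passing a bare `-p` produce the same value; only an explicit path yields a non-empty string.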
Download omics data from Synapse at Synapse ID syn66364488. Requires a Synapse token, which in turn requires a Synapse account
and a Personal Access Token. More information here: https://help.synapse.org/docs/Managing-Your-Account.2055405596.html#ManagingYourAccount-PersonalAccessTokens
The omics data is an Excel file, which is then parsed for the RNA-seq, copy number, and mutation data.

Parameters
----------
synID : string
    Synapse ID of the dataset to download. Default is the Synapse ID of the sequencing dataset.

save_path : string
    Local path where the downloaded file will be saved.

synToken : string
    Synapse Personal Access Token of the user. Requires a Synapse account. More information at: https://help.synapse.org/docs/Managing-Your-Account.2055405596.html#ManagingYourAccount-PersonalAccessTokens
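A minimal sketch of the download step this docstring describes, not the PR's actual implementation. It assumes the `synapseclient` package, logs in with the Personal Access Token, and fetches the entity into `save_path`, which is treated here as a directory. The function name `download_from_synapse` and the `client` parameter are hypothetical; `client` exists only so the logic can be exercised without network access.

```python
import os


def download_from_synapse(synID, save_path, synToken, client=None):
    """Download a Synapse entity into save_path and return the local file path.

    `client` is a hypothetical hook for testing. When it is omitted, a real
    synapseclient.Synapse session is created (requires `pip install synapseclient`).
    """
    if client is None:
        import synapseclient  # third-party dependency, imported lazily
        client = synapseclient.Synapse()

    # Authenticate with the Personal Access Token.
    client.login(authToken=synToken)

    # Assumption: save_path is a directory; create it if it does not exist.
    os.makedirs(save_path, exist_ok=True)

    # Fetch the entity; the returned object exposes the downloaded file's path.
    entity = client.get(synID, downloadLocation=save_path)
    return entity.path
```

A call might look like `download_from_synapse("syn66364488", "data/", token)`, after which the returned path can be handed to the Excel parsing step.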
Maps copy number data to improve sample IDs and Entrez gene data. Also does some data formatting.

Parameters
----------
copy_number_data : pd.DataFrame OR string
    Pandas DataFrame object with copy number data OR path to a CSV with copy number data

improve_id_data : pd.DataFrame OR string
    Pandas DataFrame object with improve ID data OR path to a CSV with improve ID data. This is one of the outputs of parse_mmc2()

entrez_data : pd.DataFrame OR string
    Pandas DataFrame object with Entrez gene data OR path to a CSV with Entrez gene data. Use this code to get this file: https://github.com/PNNL-CompBio/coderdata/tree/e65634b99d060136190ec5fba0b7798f8d140dfb/build/genes

Returns
-------
sample_entrez_cn_df : pd.DataFrame
    A DataFrame containing the mapped copy number data with columns: entrez_id, copy_number, copy_call, study, source, improve_sample_id
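The "DataFrame OR path" convention and the two-step mapping described above can be sketched as follows. This is an illustration under stated assumptions, not the function from the PR: the helper names `as_dataframe` and `map_copy_number`, and the input column names `sample`, `gene_symbol`, and `other_id`, are hypothetical. Only `entrez_id`, `copy_number`, and `improve_sample_id` come from the docstring, and the remaining output columns (`copy_call`, `study`, `source`) are omitted for brevity.

```python
import pandas as pd


def as_dataframe(data):
    """Return `data` unchanged if it is a DataFrame, otherwise read it as a CSV path."""
    if isinstance(data, pd.DataFrame):
        return data
    if isinstance(data, str):
        return pd.read_csv(data)
    raise TypeError(f"expected a DataFrame or a CSV path, got {type(data).__name__}")


def map_copy_number(copy_number_data, improve_id_data, entrez_data):
    """Attach Entrez gene IDs and improve sample IDs to copy number rows."""
    cn = as_dataframe(copy_number_data)
    ids = as_dataframe(improve_id_data)
    genes = as_dataframe(entrez_data)

    # Step 1: gene symbol -> Entrez ID (hypothetical column names).
    gene_map = genes[["gene_symbol", "entrez_id"]].drop_duplicates()
    merged = cn.merge(gene_map, on="gene_symbol", how="inner")

    # Step 2: original sample name -> improve sample ID.
    id_map = ids[["other_id", "improve_sample_id"]].drop_duplicates()
    merged = merged.merge(id_map, left_on="sample", right_on="other_id", how="inner")

    return merged[["entrez_id", "copy_number", "improve_sample_id"]].reset_index(drop=True)


if __name__ == "__main__":
    cn = pd.DataFrame({
        "sample": ["S1", "S1", "S2"],
        "gene_symbol": ["TP53", "KRAS", "TP53"],
        "copy_number": [1.0, 2.5, 2.0],
    })
    genes = pd.DataFrame({"gene_symbol": ["TP53", "KRAS"], "entrez_id": [7157, 3845]})
    ids = pd.DataFrame({"other_id": ["S1", "S2"], "improve_sample_id": [101, 102]})
    print(map_copy_number(cn, ids, genes))
```

Inner joins drop rows whose gene symbol or sample name has no match, which may or may not be the behaviour the PR intends; a left join with an explicit check for missing IDs is an alternative.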
Maps transcriptomics data to improve sample IDs and Entrez gene data. Also does some data formatting.

Parameters
----------
copy_number_data : pd.DataFrame OR string
    Pandas DataFrame object with transcriptomics data OR path to a CSV with transcriptomics data

improve_id_data : pd.DataFrame OR string
    Pandas DataFrame object with improve ID data OR path to a CSV with improve ID data. This is one of the outputs of parse_mmc2()

entrez_data : pd.DataFrame OR string
    Pandas DataFrame object with Entrez gene data OR path to a CSV with Entrez gene data. Use this code to get this file: https://github.com/PNNL-CompBio/coderdata/tree/e65634b99d060136190ec5fba0b7798f8d140dfb/build/genes

Returns
-------
sample_entrez_cn_df : pd.DataFrame
    A DataFrame containing the mapped transcriptomics data with columns: entrez_id, copy_number, copy_call, study, source, improve_sample_id
parser=argparse.ArgumentParser(description="This script handles downloading, processing and formatting of omics data files for the Bladder PDO project")
parser.add_argument('-s', '--samples', help='Path to sample file', default=None)
parser.add_argument('-g', '--genes', help='Path to genes file', default=None)
parser.add_argument('-c', '--copy', help='Flag to capture copy number data', action='store_true', default=False)
parser.add_argument('-m', '--mutation', help='Flag to capture mutation data', action='store_true', default=False)
parser.add_argument('-e', '--expression', help='Flag to capture transcriptomic data', action='store_true', default=False)
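To show how these flags behave at the command line, the sketch below rebuilds the same five arguments and parses a sample argument list. The `build_parser` and `selected_steps` helpers, the step names, and the file names are hypothetical; how the real script dispatches on the flags is not shown in this diff.

```python
import argparse


def build_parser():
    # Same arguments as in the diff, wrapped in a hypothetical helper.
    parser = argparse.ArgumentParser(
        description="This script handles downloading, processing and formatting "
                    "of omics data files for the Bladder PDO project"
    )
    parser.add_argument('-s', '--samples', help='Path to sample file', default=None)
    parser.add_argument('-g', '--genes', help='Path to genes file', default=None)
    parser.add_argument('-c', '--copy', help='Flag to capture copy number data',
                        action='store_true', default=False)
    parser.add_argument('-m', '--mutation', help='Flag to capture mutation data',
                        action='store_true', default=False)
    parser.add_argument('-e', '--expression', help='Flag to capture transcriptomic data',
                        action='store_true', default=False)
    return parser


def selected_steps(args):
    """Return the omics types requested via the boolean flags (hypothetical helper)."""
    steps = []
    if args.copy:
        steps.append("copy_number")
    if args.mutation:
        steps.append("mutation")
    if args.expression:
        steps.append("transcriptomics")
    return steps


# Equivalent to: python script.py -s samples.csv -g genes.csv -c -e
args = build_parser().parse_args(["-s", "samples.csv", "-g", "genes.csv", "-c", "-e"])
steps = selected_steps(args)
print(args.samples, args.genes, steps)
```

Each `store_true` flag is `False` unless it appears on the command line, so any combination of the three omics types can be requested in one run.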