biofx.annotator package
Submodules
biofx.annotator.GSCannotation module
Module to retrieve annotations from GSC resources. @cchng
Created in January 2015.
-
class biofx.annotator.GSCannotation.Ensembl(resource='/projects/yshen_prj/db/Ensembl_69_transcripts.txt')
Bases: object
Legacy resource. Avoid if possible.
-
class biofx.annotator.GSCannotation.Hugo(resource='/projects/yshen_prj/db/hugo_genenames_08Mar2012.txt')
Bases: object
Legacy resource. Avoid if possible.
-
class biofx.annotator.GSCannotation.WTSSGeneExon(ensg2genesymbol, resource='/projects/wtsspipeline/resources/Homo_sapiens/bfa_NCBI-37-TCGA/transcript_coverage/ens69_mito_as_MT_no_LRG_genes/gene_exon.txt')
Bases: object
Information from the WTSS gene_exon.txt file.
-
class biofx.annotator.GSCannotation.WTSSGeneInfo(resource='/projects/wtsspipeline/resources/Homo_sapiens/bfa_NCBI-37-TCGA/transcript_coverage/ens69_mito_as_MT_no_LRG_genes/gene_info.txt')
Bases: object
Parse and retrieve information from a WTSS gene_info.txt file.
Parameters: resource (string) – path to a gene_info.txt file in WTSS format
Raises: IOError – the resource file provided does not exist
-
get_ensg2symbol()
Returns: a mapping with Ensembl gene IDs as keys and gene symbols as values
Return type: dict
Raises: RuntimeError – something went wrong while populating the mappings
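The error contract described above (IOError for a missing resource, RuntimeError for an unpopulated mapping) can be sketched as follows. This is not the biofx implementation: the class name `GeneInfoSketch` is hypothetical, and the two-column tab-delimited layout (Ensembl gene ID, then gene symbol) is an assumption made only for illustration. The real gene_info.txt layout is not documented here.

```python
import os


class GeneInfoSketch(object):
    """Illustrative stand-in; NOT the real WTSSGeneInfo class."""

    def __init__(self, resource):
        # Fail early when the resource file is missing.
        if not os.path.isfile(resource):
            raise IOError("resource file does not exist: %s" % resource)
        self._ensg2symbol = {}
        with open(resource) as handle:
            for line in handle:
                if line.startswith("#"):
                    continue
                fields = line.rstrip("\n").split("\t")
                # ASSUMED layout: column 1 = Ensembl gene ID, column 2 = symbol.
                if len(fields) >= 2:
                    self._ensg2symbol[fields[0]] = fields[1]

    def get_ensg2symbol(self):
        # An empty mapping is treated as a failed population step.
        if not self._ensg2symbol:
            raise RuntimeError("ensg-to-symbol mapping was not populated")
        # Return a copy so callers cannot mutate the internal state.
        return dict(self._ensg2symbol)
```

Returning a copy of the dict is a design choice of this sketch; whether the real accessor copies or exposes its internal mapping is not stated in the reference.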
-
get_ensg_from_enst(enst)
Get Ensembl gene ID from Ensembl transcript ID.
Parameters: enst (string) – Ensembl transcript ID
Returns: Ensembl gene ID. NA if Ensembl transcript ID not found
Return type: string
-
get_ensg_from_symbol(symbol)
Get Ensembl gene ID from gene symbol. This is a one-to-many mapping.
Parameters: symbol (string) – gene symbol
Returns: Ensembl gene ID, comma-delimited if the symbol maps to multiple IDs. "NA" if the symbol is not found
Return type: string
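The one-to-many behaviour described above can be illustrated with plain dictionaries. This is a minimal sketch, not the library's code: the helper names `build_symbol2ensg` and `ensg_from_symbol` are hypothetical, the identifiers are made up, and sorting the IDs is an assumption (the reference does not say in what order multiple IDs are joined).

```python
from collections import defaultdict


def build_symbol2ensg(ensg2symbol):
    """Invert an ensg -> symbol dict, keeping every gene ID per symbol."""
    symbol2ensg = defaultdict(list)
    # Sorting makes the comma-delimited output deterministic (an assumption).
    for ensg, symbol in sorted(ensg2symbol.items()):
        symbol2ensg[symbol].append(ensg)
    return dict(symbol2ensg)


def ensg_from_symbol(symbol2ensg, symbol):
    """Return the gene IDs for a symbol as a comma-delimited string, or "NA"."""
    ids = symbol2ensg.get(symbol)
    return ",".join(ids) if ids else "NA"


# Made-up identifiers, for illustration only.
ensg2symbol = {"ENSG_A1": "GENE_A", "ENSG_A2": "GENE_A", "ENSG_B1": "GENE_B"}
symbol2ensg = build_symbol2ensg(ensg2symbol)

print(ensg_from_symbol(symbol2ensg, "GENE_A"))  # ENSG_A1,ENSG_A2
print(ensg_from_symbol(symbol2ensg, "GENE_C"))  # NA
```

Callers that need the individual IDs would split the returned string on commas after checking for "NA".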
-
get_enst2ensg()
Returns: a mapping with Ensembl transcript IDs as keys and Ensembl gene IDs as values
Return type: dict
Raises: RuntimeError – something went wrong while populating the mappings
-
get_gene_symbol(ensg)
Get gene symbol from Ensembl gene ID.
Parameters: ensg (string) – Ensembl gene ID
Returns: gene symbol. NA if gene symbol not found
Return type: string
-
get_strand_from_gene_symbol(gene_symbol)
Parameters: gene_symbol (string) – gene symbol
Returns: strand. NA if gene symbol not found
Return type: string
-
get_symbol2ensg()
Returns: a mapping with gene symbols as keys and Ensembl gene IDs as values
Return type: dict
Raises: RuntimeError – something went wrong while populating the mappings
-
biofx.annotator.VariantAnnotation module
@cchng
-
class biofx.annotator.VariantAnnotation.SnpEff(java_mem='Xmx4g', version=None)
Bases: object
A wrapper for snpeff/snpsift 4.1 execution.
-
annotate_with_snpeff(input_file, output_file, genome, snpeff=None)
Annotate input with snpeff. Runs classic and hgvs annotations concurrently.
Parameters:
- input_file (string) – input vcf file path
- output_file (string) – output vcf file path. hgvs output has an hgvs suffix appended
- genome (string) – genome used for snpeff
- snpeff (string) – snpeff executable
Returns: revals, a list of return values
Return type: list
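Running two annotation jobs concurrently and collecting a list of return values can be sketched with the standard `subprocess` module. This is only an illustration of the pattern, not how the wrapper is implemented: `run_concurrently` is a hypothetical helper, and the demo uses the Python interpreter as a stand-in for the two snpeff command lines, whose exact arguments are not given in this reference.

```python
import subprocess
import sys


def run_concurrently(jobs):
    """Start every (argv, output_path) job, then wait for all of them.

    Each job's stdout is redirected to its output path. The exit codes
    are returned in the same order as the jobs were given.
    """
    running = []
    for argv, output_path in jobs:
        out = open(output_path, "w")
        # Popen returns immediately, so all jobs run side by side.
        running.append((subprocess.Popen(argv, stdout=out), out))
    retvals = []
    for proc, out in running:
        retvals.append(proc.wait())
        out.close()
    return retvals


if __name__ == "__main__":
    import os
    import tempfile

    workdir = tempfile.mkdtemp()
    # Stand-in commands; real use would pass the two snpeff invocations.
    jobs = [
        ([sys.executable, "-c", "print('classic run')"],
         os.path.join(workdir, "out.vcf")),
        ([sys.executable, "-c", "print('hgvs run')"],
         os.path.join(workdir, "out.hgvs.vcf")),
    ]
    print(run_concurrently(jobs))
```

A non-zero entry in the returned list indicates that the corresponding job failed, so callers would typically check every element rather than only the first.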
-
annotate_with_snpsift(input_file, output_file, annotate, snpsift=None)
Annotate input with snpsift.
Parameters:
- input_file (string) – input vcf file path
- output_file (string) – output vcf file path
- annotate (string) – vcf file used for snpsift annotation
- snpsift (string) – snpsift executable
Returns: reval, the return value
Return type: int
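For orientation, a SnpSift `annotate` run takes the annotation VCF followed by the input VCF and writes the result to standard output. The sketch below builds such a command and redirects its output to a file. It is an assumption-laden illustration rather than the wrapper's code: the helper names are hypothetical, `snpsift` is assumed to be a path to the SnpSift jar, and prefixing `java_mem` with a dash to form the JVM option is a guess based on the `'Xmx4g'` default shown for the SnpEff class.

```python
import subprocess


def build_snpsift_command(input_file, annotate, snpsift="SnpSift.jar",
                          java_mem="Xmx4g"):
    """Return the argv list for a SnpSift annotate run (assumed layout)."""
    return ["java", "-" + java_mem, "-jar", snpsift,
            "annotate", annotate, input_file]


def annotate_with_snpsift_sketch(input_file, output_file, annotate,
                                 runner=subprocess.call, **kwargs):
    """Run the command, writing stdout to output_file; return the exit code.

    `runner` is injectable so the function can be exercised without Java.
    """
    argv = build_snpsift_command(input_file, annotate, **kwargs)
    with open(output_file, "w") as out:
        return runner(argv, stdout=out)
```

Making the runner injectable is a choice of this sketch; it lets the command construction be checked on a machine without Java or SnpSift installed.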
-