Nucleic acids research, 2019
Authors
Haile, Simon, Corbett, Richard D, Bilobram, Steve, Bye, Morgan H, Kirk, Heather, Pandoh, Pawan, Trinh, Eva, MacLeod, Tina, McDonald, Helen, Bala, Miruna, Miller, Diane, Novik, Karen, Coope, Robin J, Moore, Richard A, Zhao, Yongjun, Mungall, Andrew J, Ma, Yussanne, Holt, Rob A, Jones, Steven J, Marra, Marco A
Publication Abstract
Tissues used in pathology laboratories are typically stored in the form of formalin-fixed, paraffin-embedded (FFPE) samples. One important consideration in repurposing FFPE material for next generation sequencing (NGS) analysis is the sequencing artifacts that can arise from the significant damage to nucleic acids due to treatment with formalin, storage at room temperature and extraction. One such class of artifacts consists of chimeric reads that appear to be derived from non-contiguous portions of the genome. Here, we show that a major proportion of such chimeric reads align to both the 'Watson' and 'Crick' strands of the reference genome. We refer to these as strand-split artifact reads (SSARs). This study provides a conceptual framework for the mechanistic basis of the genesis of SSARs and other chimeric artifacts along with supporting experimental evidence, which have led to approaches to reduce the levels of such artifacts. We demonstrate that one of these approaches, involving S1 nuclease-mediated removal of single-stranded fragments and overhangs, also reduces sequence bias, base error rates, and false positive detection of copy number and single nucleotide variants. Finally, we describe an analytical approach for quantifying SSARs from NGS data.

PloS one, 2019
Authors
Haile, Simon, Corbett, Richard D, Bilobram, Steve, Mungall, Karen, Grande, Bruno M, Kirk, Heather, Pandoh, Pawan, MacLeod, Tina, McDonald, Helen, Bala, Miruna, Coope, Robin J, Moore, Richard A, Mungall, Andrew J, Zhao, Yongjun, Morin, Ryan D, Jones, Steven J, Marra, Marco A
Publication Abstract
Next generation RNA-sequencing (RNA-seq) is a flexible approach that can be applied to a range of applications including global quantification of transcript expression, the characterization of RNA structure such as splicing patterns and profiling of expressed mutations. Many RNA-seq protocols require up to microgram levels of total RNA input amounts to generate high quality data, and thus remain impractical for the limited starting material amounts typically obtained from rare cell populations, such as those from early developmental stages or from laser micro-dissected clinical samples. Here, we present an assessment of the contemporary ribosomal RNA depletion-based protocols, and identify those that are suitable for inputs as low as 1-10 ng of intact total RNA and 100-500 ng of partially degraded RNA from formalin-fixed paraffin-embedded tissues.

Journal of clinical oncology : official journal of the American Society of Clinical Oncology, 2019
Authors
Ennishi, Daisuke, Jiang, Aixiang, Boyle, Merrill, Collinge, Brett, Grande, Bruno M, Ben-Neriah, Susana, Rushton, Christopher, Tang, Jeffrey, Thomas, Nicole, Slack, Graham W, Farinha, Pedro, Takata, Katsuyoshi, Miyata-Takata, Tomoko, Craig, Jeffrey, Mottok, Anja, Meissner, Barbara, Saberi, Saeed, Bashashati, Ali, Villa, Diego, Savage, Kerry J, Sehn, Laurie H, Kridel, Robert, Mungall, Andrew J, Marra, Marco A, Shah, Sohrab P, Steidl, Christian, Connors, Joseph M, Gascoyne, Randy D, Morin, Ryan D, Scott, David W
Publication Abstract
High-grade B-cell lymphoma with MYC and BCL2 and/or BCL6 rearrangements (HGBL-DH/TH) has a poor outcome after standard chemoimmunotherapy. We sought to understand the biologic underpinnings of HGBL-DH/TH with BCL2 rearrangements (HGBL-DH/TH-BCL2) and diffuse large B-cell lymphoma (DLBCL) morphology through examination of gene expression.

PloS one, 2019
Authors
Schuetz, Johanna M, Grundy, Anne, Lee, Derrick G, Lai, Agnes S, Kobayashi, Lindsay C, Richardson, Harriet, Long, Jirong, Zheng, Wei, Aronson, Kristan J, Spinelli, John J, Brooks-Wilson, Angela R
Publication Abstract
Inflammation contributes to breast cancer development through its effects on cell damage. This damage is usually dealt with by key genes involved in apoptosis and autophagy pathways.

Methods in molecular biology (Clifton, N.J.), 2019
Authors
Alcaide, Miguel, Rushton, Christopher, Morin, Ryan D
Publication Abstract
Liquid biopsies are rapidly emerging as powerful tools for the early detection of cancer, noninvasive genomic profiling of localized or metastatic tumors, prompt detection of treatment resistance-associated mutations, and monitoring of therapeutic response and minimal residual disease in patients during clinical follow-up. Growing evidence strongly supports the utility of circulating tumor DNA (ctDNA) as a biomarker for the stratification and clinical management of lymphoma patients. However, ctDNA is diluted by variable amounts of cell-free DNA (cfDNA) shed by nonneoplastic cells causing a background signal of wild-type DNA that limits the sensitivity of methods that rely on DNA sequencing. Here, we describe an error suppression method for single-molecule counting that relies on targeted sequencing of cfDNA libraries constructed with semi-degenerate barcode adapters. Custom pools of biotinylated DNA baits for target enrichment can be designed to specifically track somatic mutations in one patient, survey mutation hotspots with diagnostic and prognostic value or be comprised of comprehensive gene panels with broad patient coverage in lymphoma. Such methods are amenable to track ctDNA levels during longitudinal liquid biopsy testing with high specificity and sensitivity and characterize, in real time, the genetic profiles of tumors without the need of standard invasive biopsies. The analysis of ultra-deep sequencing data according to the bioinformatics pipelines also described in this chapter affords to harness lower limits of detection for ctDNA below 0.1%.

Molecular therapy oncolytics, 2018
Authors
Huff, Amanda L, Wongthida, Phonphimon, Kottke, Timothy, Thompson, Jill M, Driscoll, Christopher B, Schuelke, Matthew, Shim, Kevin G, Harris, Reuben S, Molan, Amy, Pulido, Jose S, Selby, Peter J, Harrington, Kevin J, Melcher, Alan, Evgin, Laura, Vile, Richard G
Publication Abstract
Tumor cells frequently evade applied therapies through the accumulation of genomic mutations and rapid evolution. In the case of oncolytic virotherapy, understanding the mechanisms by which cancer cells develop resistance to infection and lysis is critical to the development of more effective viral-based platforms. Here, we identify APOBEC3 as an important factor that restricts the potency of oncolytic vesicular stomatitis virus (VSV). We show that VSV infection of B16 murine melanoma cells upregulated APOBEC3 in an IFN-β-dependent manner, which was responsible for the evolution of virus-resistant cell populations and suggested that APOBEC3 expression promoted the acquisition of a virus-resistant phenotype. Knockdown of APOBEC3 in B16 cells diminished their capacity to develop resistance to VSV infection and enhanced the therapeutic effect of VSV in vivo. Similarly, overexpression of human APOBEC3B promoted the acquisition of resistance to oncolytic VSV both in vitro and in vivo. Finally, we demonstrate that APOBEC3B expression had a direct effect on the fitness of VSV, an RNA virus that has not previously been identified as restricted by APOBEC3B. This research identifies APOBEC3 enzymes as key players to target in order to improve the efficacy of viral or broader nucleic acid-based therapeutic platforms.

Cold Spring Harbor molecular case studies, 2018
Authors
Ko, Jenny J, Grewal, Jasleen K, Ng, Tony, Lavoie, Jean-Michel, Thibodeau, My Linh, Shen, Yaoqing, Mungall, Andrew J, Taylor, Greg, Schrader, Kasmintan A, Jones, Steven J M, Kollmannsberger, Christian, Laskin, Janessa, Marra, Marco A
Publication Abstract
Thyroid-like follicular renal cell carcinoma (TLFRCC) is a rare cancer with few reports of metastatic disease. Little is known regarding genomic characteristics and therapeutic targets. We present the clinical, pathologic, genomic, and transcriptomic analyses of a case of a 27-yr-old male with TLFRCC who presented initially with bone metastases of unknown primary. Genomic DNA from peripheral blood and metastatic tumor samples were sequenced. A transcriptome of 280 million sequence reads was generated from the same tumor sample. Tumor somatic expression profiles were analyzed to detect aberrant expression. Genomic and transcriptomic data sets were integrated to reveal dysregulation in pathways and identify potential therapeutic targets. Integrative genomic analysis with The Cancer Genome Atlas (TCGA) data set revealed the following outliers in gene expression profiles: (81st percentile), (99th percentile), (100th percentile), and (99th and 100th percentiles, respectively), and (86th percentile). The patient received first-line sunitinib to target PDGFRA and PDGFRB and had stable disease for >6 mo, followed by nivolumab upon progression. To the authors' knowledge, this is the first reported case of comprehensive somatic genomic analyses in a patient with metastatic TLFRCC. Somatic analyses provided molecular confirmation of the primary site of cancer and potential therapeutic strategies in a rare disease with little evidence of efficacy on systemic therapy.

Genes, 2018
Authors
Taylor, Gregory A, Kirk, Heather, Coombe, Lauren, Jackman, Shaun D, Chu, Justin, Tse, Kane, Cheng, Dean, Chuah, Eric, Pandoh, Pawan, Carlsen, Rebecca, Zhao, Yongjun, Mungall, Andrew J, Moore, Richard, Birol, Inanc, Franke, Maria, Marra, Marco A, Dutton, Christopher, Jones, Steven J M
Publication Abstract
The grizzly bear (Ursus arctos ssp. horribilis) represents the largest population of brown bears in North America. Its genome was sequenced using a microfluidic partitioning library construction technique, and these data were supplemented with sequencing from a nanopore-based long read platform. The final assembly was 2.33 Gb with a scaffold N50 of 36.7 Mb, and the genome is of comparable size to that of its close relative the polar bear (2.30 Gb). An analysis using 4104 highly conserved mammalian genes indicated that 96.1% were found to be complete within the assembly. An automated annotation of the genome identified 19,848 protein coding genes. Our study shows that the combination of the two sequencing modalities that we used is sufficient for the construction of highly contiguous reference quality mammalian genomes. The assembled genome sequence and the supporting raw sequence reads are available from the NCBI (National Center for Biotechnology Information) under the bioproject identifier PRJNA493656, and the assembly described in this paper is version QXTK01000000.

Cell stem cell, 2018
Authors
Giambra, Vincenzo, Gusscott, Samuel, Gracias, Deanne, Song, Raymond, Lam, Sonya H, Panelli, Patrizio, Tyshchenko, Kateryna, Jenkins, Catherine E, Hoofd, Catherine, Lorzadeh, Alireza, Carles, Annaick, Hirst, Martin, Eaves, Connie J, Weng, Andrew P
Publication Abstract
Acute leukemias are aggressive malignancies of developmentally arrested hematopoietic progenitors. We sought here to explore the possibility that changes in hematopoietic stem/progenitor cells during development might alter the biology of leukemias arising from this tissue compartment. Using a mouse model of acute T cell leukemia, we found that leukemias generated from fetal liver (FL) and adult bone marrow (BM) differed dramatically in their leukemia stem cell activity with FL leukemias showing markedly reduced serial transplantability as compared to BM leukemias. We present evidence that this difference is due to NOTCH1-driven autocrine IGF1 signaling, which is active in FL cells but restrained in BM cells by EZH2-dependent H3K27 trimethylation. Further, we confirmed this mechanism is operative in human disease and show that enforced IGF1 signaling effectively limits leukemia stem cell activity. These findings demonstrate that resurrecting dormant fetal programs in adult cells may represent an alternate therapeutic approach in human cancer.

BMC bioinformatics, 2018
Authors
Jackman, Shaun D, Coombe, Lauren, Chu, Justin, Warren, Rene L, Vandervalk, Benjamin P, Yeo, Sarah, Xue, Zhuyi, Mohamadi, Hamid, Bohlmann, Joerg, Jones, Steven J M, Birol, Inanc
Publication Abstract
Genome sequencing yields the sequence of many short snippets of DNA (reads) from a genome. Genome assembly attempts to reconstruct the original genome from which these reads were derived. This task is difficult due to gaps and errors in the sequencing data, repetitive sequence in the underlying genome, and heterozygosity. As a result, assembly errors are common. In the absence of a reference genome, these misassemblies may be identified by comparing the sequencing data to the assembly and looking for discrepancies between the two. Once identified, these misassemblies may be corrected, improving the quality of the assembled sequence. Although tools exist to identify and correct misassemblies using Illumina paired-end and mate-pair sequencing, no such tool yet exists that makes use of the long distance information of the large molecules provided by linked reads, such as those offered by the 10x Genomics Chromium platform. We have developed the tool Tigmint to address this gap.