HuBMAP Publications

There are 60 publications.
Publish Date | Title | Author(s) | HuBMAP Component | Abstract
2018-10-29 | Identification of spatially associated subpopulations by combining scRNAseq and sequential fluorescence in situ hybridization data | Zhu Q, Shah S, Dries R, Cai L, Yuan GC. | TTD-Cal Tech | How intrinsic gene-regulatory networks interact with a cell's spatial environment to define its identity remains poorly understood. We developed an approach to distinguish between intrinsic and extrinsic effects on global gene expression by integrating analysis of sequencing-based and imaging-based single-cell transcriptomic profiles, using cross-platform cell type mapping combined with a hidden Markov random field model. We applied this approach to dissect the cell-type- and spatial-domain-associated heterogeneity in the mouse visual cortex region. Our analysis identified distinct spatially associated, cell-type-independent signatures in the glutamatergic and astrocyte cell compartments. Using these signatures to analyze single-cell RNA sequencing data, we identified previously unknown spatially associated subpopulations, which were validated by comparison with anatomical structures and Allen Brain Atlas images.
2018-12-11 | Forecasting innovations in science, technology, and education | Börner K, Rouse WB, Trunfio P, Stanley HE. | HIVE MC-IU | Human survival depends on our ability to predict future outcomes so that we can make informed decisions. Human cognition and perception are optimized for local, short-term decision-making, such as deciding when to fight or flight, whom to mate, or what to eat. For more elaborate decisions (e.g., when to harvest, when to go to war or not, and whom to marry), people used to consult oracles, prophetic predictions of the future inspired by the gods. Over time, oracles were replaced by models of the structure and dynamics of natural, technological, and social systems. In the 21st century, computational models and visualizations of model results inform much of our decision-making: near real-time weather forecasts help us decide when to take an umbrella, plant, or harvest; where to ground airplanes; or when to evacuate inhabitants in the path of a hurricane, tornado, or flood. Long-term weather and climate forecasts predict a future with increasing torrential rains, stronger winds, and more frequent drought, landslides, and forest fires as well as rising sea levels, enabling decision makers to prepare for these changes by building dikes, moving cities and roads, and building larger water reservoirs and better storm sewers.
2018-12-19 | Cell Hashing with barcoded antibodies enables multiplexing and doublet detection for single cell genomics | Stoeckius M, Zheng S, Houck-Loomis B, Hao S, Yeung BZ, Mauck WM, Smibert P, Satija R. | HIVE MC-NYGC | Despite rapid developments in single cell sequencing, sample-specific batch effects, detection of cell multiplets, and experimental costs remain outstanding challenges. Here, we introduce Cell Hashing, where oligo-tagged antibodies against ubiquitously expressed surface proteins uniquely label cells from distinct samples, which can be subsequently pooled. By sequencing these tags alongside the cellular transcriptome, we can assign each cell to its original sample, robustly identify cross-sample multiplets, and “super-load” commercial droplet-based systems for significant cost reduction. We validate our approach using a complementary genetic approach and demonstrate how hashing can generalize the benefits of single cell multiplexing to diverse samples and experimental designs.
2019-02-01 | Protein identification strategies in MALDI imaging mass spectrometry: a brief review | Ryan DJ, Spraggins JM, Caprioli RM. | TMC-Vanderbilt (Kidney) | Matrix assisted laser desorption/ionization (MALDI) imaging mass spectrometry (IMS) is a powerful technology used to investigate the spatial distributions of thousands of molecules throughout a tissue section from a single experiment. As proteins represent an important group of functional molecules in tissue and cells, the imaging of proteins has been an important point of focus in the development of IMS technologies and methods. Protein identification is crucial for the biological contextualization of molecular imaging data. However, gas-phase fragmentation efficiency of MALDI generated proteins presents significant challenges, making protein identification directly from tissue difficult. This review highlights methods and technologies specifically related to protein identification that have been developed to overcome these challenges in MALDI IMS experiments.
2019-02-05 | Data visualization literacy: Definitions, conceptual frameworks, exercises, and assessments | Börner K, Bueckle A and Ginda M. | HIVE MC-IU | In the information age, the ability to read and construct data visualizations becomes as important as the ability to read and write text. However, while standard definitions and theoretical frameworks to teach and assess textual, mathematical, and visual literacy exist, current data visualization literacy (DVL) definitions and frameworks are not comprehensive enough to guide the design of DVL teaching and assessment. This paper introduces a data visualization literacy framework (DVL-FW) that was specifically developed to define, teach, and assess DVL. The holistic DVL-FW promotes both the reading and construction of data visualizations, a pairing analogous to that of both reading and writing in textual literacy and understanding and applying in mathematical literacy. Specifically, the DVL-FW defines a hierarchical typology of core concepts and details the process steps that are required to extract insights from data. Advancing the state of the art, the DVL-FW interlinks theoretical and procedural knowledge and showcases how both can be combined to design curricula and assessment measures for DVL. Earlier versions of the DVL-FW have been used to teach DVL to more than 8,500 residential and online students, and results from this effort have helped revise and validate the DVL-FW presented here.
2019-02-15 | Dhaka: Variational Autoencoder for Unmasking Tumor Heterogeneity from Single Cell Genomic Data | Rashid S, Shah S, Bar-Joseph Z, Pandya R. | HIVE TC-CMU | MOTIVATION: Intra-tumor heterogeneity is one of the key confounding factors in deciphering tumor evolution. Malignant cells exhibit variations in their gene expression, copy numbers, and mutation even when originating from a single progenitor cell. Single cell sequencing of tumor cells has recently emerged as a viable option for unmasking the underlying tumor heterogeneity. However, extracting features from single cell genomic data in order to infer their evolutionary trajectory remains computationally challenging due to the extremely noisy and sparse nature of the data. RESULTS: Here we describe 'Dhaka', a variational autoencoder method which transforms single cell genomic data to a reduced dimension feature space that is more efficient in differentiating between (hidden) tumor subpopulations. Our method is general and can be applied to several different types of genomic data including copy number variation from scDNA-Seq and gene expression from scRNA-Seq experiments. We tested the method on synthetic and 6 single cell cancer datasets where the number of cells ranges from 250 to 6000 for each sample. Analysis of the resulting feature space revealed subpopulations of cells and their marker genes. The features are also able to infer the lineage and/or differentiation trajectory between cells, greatly improving upon prior methods suggested for feature extraction and dimensionality reduction of such data. AVAILABILITY AND IMPLEMENTATION: All the datasets used in the paper are publicly available, and the developed software package and supporting info are available on GitHub.
2019-02-20 | The single-cell transcriptional landscape of mammalian organogenesis | Cao J, Spielmann M, Qiu X, Huang X, Ibrahim DM, Hill AJ, Zhang F, Mundlos S, Christiansen L, Steemers FJ, Trapnell C & Shendure J | TMC-Cal Tech | Mammalian organogenesis is a remarkable process. Within a short timeframe, the cells of the three germ layers transform into an embryo that includes most of the major internal and external organs. Here we investigate the transcriptional dynamics of mouse organogenesis at single-cell resolution. Using single-cell combinatorial indexing, we profiled the transcriptomes of around 2 million cells derived from 61 embryos staged between 9.5 and 13.5 days of gestation, in a single experiment. The resulting ‘mouse organogenesis cell atlas’ (MOCA) provides a global view of developmental processes during this critical window. We use Monocle 3 to identify hundreds of cell types and 56 trajectories, many of which are detected only because of the depth of cellular coverage, and collectively define thousands of corresponding marker genes. We explore the dynamics of gene expression within cell types and trajectories over time, including focused analyses of the apical ectodermal ridge, limb mesenchyme and skeletal muscle.
2019-03-01 | Multiple TOF/TOF Events in a Single Laser Shot for Multiplexed Lipid Identifications in MALDI Imaging Mass Spectrometry | Prentice BM, McMillen JC, Caprioli RM | TMC-Vanderbilt (Kidney) | Tandem mass spectrometry (MS/MS) is often used to identify lipids in matrix-assisted laser desorption/ionization imaging mass spectrometry (MALDI IMS) workflows. The molecular specificity afforded by MS/MS is crucial on MALDI time-of-flight (TOF) platforms that generally lack high resolution accurate mass measurement capabilities. Unfortunately, imaging MS/MS workflows generally only monitor a single precursor ion over the imaged area, limiting the throughput of this methodology. Herein, we demonstrate that multiple TOF/TOF events performed in each laser shot can be used to improve the throughput of imaging MS/MS. This is shown to enable the simultaneous identification of multiple phosphatidylcholine lipids in rat brain tissue. Uniquely, the separation in time achieved for the precursor ions in the TOF-1 region of the instrument is maintained for the fragment ions as they are analyzed in TOF-2, allowing for the differentiation of fragment ions of the exact same m/z derived from different precursor ions (e.g., the m/z 163 fragment ion from precursor ion m/z 772.5 is easily distinguished from the m/z 163 fragment ion from precursor ion m/z 826.5). This multiplexed imaging MS/MS approach allows for the acquisition of complete fragment ion spectra for multiple precursor ions per laser shot.
2019-03-25 | Transcriptome-scale super-resolved imaging in tissues by RNA seqFISH | Eng CL, Lawson M, Zhu Q, Dries R, Koulena N, Takei Y, Yun J, Cronin C, Karp C, Yuan GC, Cai L. | TTD-Cal Tech | Imaging the transcriptome in situ with high accuracy has been a major challenge in single-cell biology, which is particularly hindered by the limits of optical resolution and the density of transcripts in single cells. Here we demonstrate an evolution of sequential fluorescence in situ hybridization (seqFISH+). We show that seqFISH+ can image mRNAs for 10,000 genes in single cells (with high accuracy and sub-diffraction-limit resolution) in the cortex, subventricular zone and olfactory bulb of mouse brain, using a standard confocal microscope. The transcriptome-level profiling of seqFISH+ allows unbiased identification of cell classes and their spatial organization in tissues. In addition, seqFISH+ reveals subcellular mRNA localization patterns in cells and ligand-receptor pairs across neighbouring cells. This technology demonstrates the ability to generate spatial cell atlases and to perform discovery-driven studies of biological processes in situ.
2019-04-06 | Imaging mass spectrometry enables molecular profiling of mouse and human pancreatic tissue | Prentice BM, Hart NJ, Phillips N, Haliyur R, Judd A, Armandala R, Spraggins JM, Lowe CL, Boyd KL, Stein RW, Wright CV, Norris JL, Powers AC, Brissova M, Caprioli RM. | TMC-Vanderbilt (Kidney) | The molecular response and function of pancreatic islet cells during metabolic stress is a complex process. The anatomical location and small size of pancreatic islets coupled with current methodological limitations have prevented the achievement of a complete, coherent picture of the role that lipids and proteins play in cellular processes under normal conditions and in diseased states. Herein, we describe the development of untargeted tissue imaging mass spectrometry (IMS) technologies for the study of in situ protein and, more specifically, lipid distributions in murine and human pancreases.
2019-04-19 | The Importance of Clinical Tissue Imaging | Spraggins JM, Schwamborn K, Heeren RMA, Eberlin LS. | TMC-Vanderbilt (Kidney) | Tissue imaging by mass spectrometry (MS) combines the sensitivity and molecular specificity of MS with the spatial fidelity of classical histology for analysis of metabolites, lipids and proteins in tissues (Fig. 1). MS-based imaging is label-free, untargeted, sensitive, and specific, thereby enabling application in both basic biomedical research and the clinical laboratory. While all tissue imaging experiments are conceptually similar in their ability to generate spatial molecular data, their ionization, data collection, and purpose vary widely. Here, we highlight recent technical advances and efforts that are motivating translational applications of this emerging technology.
2019-05-01 | SABER amplifies FISH: enhanced multiplexed imaging of RNA and DNA in cells and tissues | Kishi JY, Lapan SW, Beliveau BJ, West ER, Zhu A, Sasaki HM, Saka SK, Wang Y, Cepko CL, Yin P. | TTD-Harvard | Fluorescence in situ hybridization (FISH) reveals the abundance and positioning of nucleic acid sequences in fixed samples. Despite recent advances in multiplexed amplification of FISH signals, it remains challenging to achieve high levels of simultaneous amplification and sequential detection with high sampling efficiency and simple workflows. Here we introduce signal amplification by exchange reaction (SABER), which endows oligonucleotide-based FISH probes with long, single-stranded DNA concatemers that aggregate a multitude of short complementary fluorescent imager strands. We show that SABER amplified RNA and DNA FISH signals (5- to 450-fold) in fixed cells and tissues. We also applied 17 orthogonal amplifiers against chromosomal targets simultaneously and detected mRNAs with high efficiency. We then used 10-plex SABER-FISH to identify in vivo introduced enhancers with cell-type-specific activity in the mouse retina. SABER represents a simple and versatile molecular toolkit for rapid and cost-effective multiplexed imaging of nucleic acid targets.
2019-05-06 | Visualizing learner engagement, performance, and trajectories to evaluate and optimize online course design | Ginda M, Richey MC, Cousino M, Börner K. | HIVE MC-IU | Learning analytics and visualizations make it possible to examine and communicate learners’ engagement, performance, and trajectories in online courses to evaluate and optimize course design for learners. This is particularly valuable for workforce training involving employees who need to acquire new knowledge in the most effective manner. This paper introduces a set of metrics and visualizations that aim to capture key dynamical aspects of learner engagement, performance, and course trajectories. The metrics are applied to identify prototypical behavior and learning pathways through and interactions with course content, activities, and assessments. The approach is exemplified and empirically validated using more than 30 million separate logged events that capture activities of 1,608 Boeing engineers taking the MITxPro Course, “Architecture of Complex Systems,” delivered in Fall 2016. Visualization results show course structure and patterns of learner interactions with course material, activities, and assessments. Tree visualizations are used to represent course hierarchical structures and explicit sequence of content modules. Learner trajectory networks represent pathways and interactions of individual learners through course modules, revealing patterns of learner engagement, content access strategies, and performance. Results provide evidence for instructors and course designers for evaluating the usage and effectiveness of course materials and intervention strategies.
2019-06-04 | Cell lineage inference from SNP and scRNA-Seq data | Ding J, Lin C, Bar-Joseph Z. | HIVE TC-CMU | Several recent studies focus on the inference of developmental and response trajectories from single cell RNA-Seq (scRNA-Seq) data. A number of computational methods, often referred to as pseudo-time ordering, have been developed for this task. Recently, CRISPR has also been used to reconstruct lineage trees by inserting random mutations. However, both approaches suffer from drawbacks that limit their use. Here, we develop a method to detect significant, cell type specific, sequence mutations from scRNA-Seq data. We show that only a few mutations are enough for reconstructing good branching models. Integrating these mutations with expression data further improves the accuracy of the reconstructed models. As we show, the majority of mutations we identify are likely RNA editing events, indicating that such information can be used to distinguish cell types.
2019-06-06 | Comprehensive Integration of Single-Cell Data | Stuart T, Butler A, Hoffman P, Hafemeister C, Papalexi E, Mauck WM 3rd, Hao Y, Stoeckius M, Smibert P, Satija R. | HIVE MC-NYGC | Single-cell transcriptomics has transformed our ability to characterize cell states, but deep biological understanding requires more than a taxonomic listing of clusters. As new methods arise to measure distinct cellular modalities, a key analytical challenge is to integrate these datasets to better understand cellular identity and function. Here, we develop a strategy to “anchor” diverse datasets together, enabling us to integrate single-cell measurements not only across scRNA-seq technologies, but also across different modalities. After demonstrating improvement over existing methods for integrating scRNA-seq data, we anchor scRNA-seq experiments with scATAC-seq to explore chromatin differences in closely related interneuron subsets and project protein expression measurements onto a bone marrow atlas to characterize lymphocyte populations. Lastly, we harmonize in situ gene expression and scRNA-seq datasets, allowing transcriptome-wide imputation of spatial gene expression patterns. Our work presents a strategy for the assembly of harmonized references and transfer of information across datasets.
2019-06-07 | The human body at cellular resolution: the NIH Human Biomolecular Atlas Program | HuBMAP Consortium | Consortium | Transformative technologies are enabling the construction of three-dimensional maps of tissues with unprecedented spatial and molecular resolution. Over the next seven years, the NIH Common Fund Human Biomolecular Atlas Program (HuBMAP) intends to develop a widely accessible framework for comprehensively mapping the human body at single-cell resolution by supporting technology development, data acquisition, and detailed spatial mapping. HuBMAP will integrate its efforts with other funding agencies, programs, consortia, and the biomedical research community at large towards the shared vision of a comprehensive, accessible three-dimensional molecular and cellular atlas of the human body, in health and under various disease conditions.
2019-06-18 | MicroLESA: Integrating Autofluorescence Microscopy, In Situ Micro-Digestions, and Liquid Extraction Surface Analysis for High Spatial Resolution Targeted Proteomic Studies | Ryan DJ, Patterson NH, Putnam NE, Wilde AD, Weiss A, Perry WJ, Cassat JE, Skaar EP, Caprioli RM, Spraggins JM. | TMC-Vanderbilt (Kidney) | The ability to target discrete features within tissue using liquid surface extractions enables the identification of proteins while maintaining the spatial integrity of the sample. Here, we present a liquid extraction surface analysis (LESA) workflow, termed microLESA, that allows proteomic profiling from discrete tissue features of ∼110 μm in diameter by integrating nondestructive autofluorescence microscopy and spatially targeted liquid droplet micro-digestion. Autofluorescence microscopy provides the visualization of tissue foci without the need for chemical stains or the use of serial tissue sections. Tryptic peptides are generated from tissue foci by applying small volume droplets (∼250 pL) of enzyme onto the surface prior to LESA. The microLESA workflow reduced the diameter of the sampled area almost 5-fold compared to previous LESA approaches. Experimental parameters, such as tissue thickness, trypsin concentration, and enzyme incubation duration, were tested to maximize proteomics analysis. The microLESA workflow was applied to the study of fluorescently labeled Staphylococcus aureus infected murine kidney to identify unique proteins related to host defense and bacterial pathogenesis. Proteins related to nutritional immunity and host immune response were identified by performing microLESA at the infectious foci and surrounding abscess. These identifications were then used to annotate specific proteins observed in infected kidney tissue by MALDI FT-ICR IMS through accurate mass matching.
2019-06-19 | The 2019 mathematical oncology roadmap | Rockne RC, Hawkins-Daarud A, Swanson KR, Sluka JP, Glazier JA, Macklin P, Hormuth DA, Jarrett AM, Lima EABF, Tinsley Oden J, Biros G, Yankeelov TE, Curtius K, Al Bakir I, Wodarz D, Komarova N, Aparicio L, Bordyuh M, Rabadan R, Finley SD, Enderling H, Caudell J, et al. | HIVE MC-IU | Whether the nom de guerre is Mathematical Oncology, Computational or Systems Biology, Theoretical Biology, Evolutionary Oncology, Bioinformatics, or simply Basic Science, there is no denying that mathematics continues to play an increasingly prominent role in cancer research. Mathematical Oncology, defined here simply as the use of mathematics in cancer research, complements and overlaps with a number of other fields that rely on mathematics as a core methodology. As a result, Mathematical Oncology has a broad scope, ranging from theoretical studies to clinical trials designed with mathematical models. This Roadmap differentiates Mathematical Oncology from related fields and demonstrates specific areas of focus within this unique field of research. The dominant theme of this Roadmap is the personalization of medicine through mathematics, modelling, and simulation. This is achieved through the use of patient-specific clinical data to: develop individualized screening strategies to detect cancer earlier; make predictions of response to therapy; design adaptive, patient-specific treatment plans to overcome therapy resistance; and establish domain-specific standards to share model predictions and to make models and simulations reproducible. The cover art for this Roadmap was chosen as an apt metaphor for the beautiful, strange, and evolving relationship between mathematics and cancer.
2019-08-19 | Immuno-SABER enables highly multiplexed and amplified protein imaging in tissues | Saka SK, Wang Y, Kishi JY, Zhu A, Zeng Y, Xie W, Kirli K, Yapp C, Cicconet M, Beliveau BJ, Lapan SW, Yin S, Lin M, Boyden ES, Kaeser PS, Pihan G, Church GM, Yin P. | TTD-Harvard | Spatial mapping of proteins in tissues is hindered by limitations in multiplexing, sensitivity and throughput. Here we report immunostaining with signal amplification by exchange reaction (Immuno-SABER), which achieves highly multiplexed signal amplification via DNA-barcoded antibodies and orthogonal DNA concatemers generated by primer exchange reaction (PER). SABER offers independently programmable signal amplification without in situ enzymatic reactions, and intrinsic scalability to rapidly amplify and visualize a large number of targets when combined with fast exchange cycles of fluorescent imager strands. We demonstrate 5- to 180-fold signal amplification in diverse samples (cultured cells, cryosections, formalin-fixed paraffin-embedded sections and whole-mount tissues), as well as simultaneous signal amplification for ten different proteins using standard equipment and workflows. We also combined SABER with expansion microscopy to enable rapid, multiplexed super-resolution tissue imaging. Immuno-SABER presents an effective and accessible platform for multiplexed and amplified imaging of proteins with high sensitivity and throughput.
2019-09-02 | A pooled single-cell genetic screen identifies regulatory checkpoints in the continuum of the epithelial-to-mesenchymal transition | McFaline-Figueroa JL, Hill AJ, Qiu X, Jackson D, Shendure J, Trapnell C. | TMC-Cal Tech | Integrating single-cell trajectory analysis with pooled genetic screening could reveal the genetic architecture that guides cellular decisions in development and disease. We applied this paradigm to probe the genetic circuitry that controls epithelial-to-mesenchymal transition (EMT). We used single-cell RNA sequencing to profile epithelial cells undergoing a spontaneous spatially determined EMT in the presence or absence of transforming growth factor-β. Pseudospatial trajectory analysis identified continuous waves of gene regulation as opposed to discrete ‘partial’ stages of EMT. KRAS was connected to the exit from the epithelial state and the acquisition of a fully mesenchymal phenotype. A pooled single-cell CRISPR-Cas9 screen identified EMT-associated receptors and transcription factors, including regulators of KRAS, whose loss impeded progress along the EMT. Inhibiting the KRAS effector MEK and its upstream activators EGFR and MET demonstrates that interruption of key signaling events reveals regulatory ‘checkpoints’ in the EMT continuum that mimic discrete stages, and reconciles opposing views of the program that controls EMT.
2019-09-10 | High-Parameter Immune Profiling with CyTOF | Sahaf B, Rahman A, Maecker HT, Bendall SC. | RTI-Stanford | Mass cytometry, or CyTOF, is a useful technology for high-parameter single-cell phenotyping, especially from suspension cells such as blood or PBMC. It is particularly appealing to monitor the systemic immune changes that could accompany cancer immunotherapy. Here we present a reference panel for identification of all major immune cell populations, with flexibility for addition of trial-specific markers. We also describe best-practice measures for minimizing and tracking batch variability. These include: sample barcoding, use of spiked-in reference cells, and lyophilization of the antibody cocktail. Our protocol assumes the use of cryopreserved PBMC, both for convenience of batching samples and for maximum comparability across patients and time points. Finally, we show an option for automated analysis using the Astrolabe platform (Astrolabe Diagnostics, Inc.).
2019-09-10 | Supervised classification enables rapid annotation of cell atlases | Pliner HA, Shendure J, Trapnell C. | TMC-Cal Tech | Single-cell molecular profiling technologies are gaining rapid traction, but the manual process by which resulting cell types are typically annotated is labor intensive and rate-limiting. We describe Garnett, a tool for rapidly annotating cell types in single-cell transcriptional profiling and single-cell chromatin accessibility datasets, based on an interpretable, hierarchical markup language of cell type-specific genes. Garnett successfully classifies cell types in tissue and whole organism datasets, as well as across species.
2019-10-07 | GiniClust3: a fast and memory-efficient tool for rare cell type identification | Dong R, Yuan GC. | TTD-Cal Tech | BACKGROUND: With the rapid development of single-cell RNA sequencing technology, it is possible to dissect cell-type composition at high resolution. A number of methods have been developed to identify rare cell types. However, existing methods are still not scalable to large datasets, limiting their utility. To overcome this limitation, we present a new software package, called GiniClust3, which is an extension of GiniClust2 and is significantly faster and more memory-efficient than previous versions. RESULTS: Using GiniClust3, it only takes about 7 h to identify both common and rare cell clusters from a dataset that contains more than one million cells. Cell type mapping and perturbation analyses show that GiniClust3 could robustly identify cell clusters. CONCLUSIONS: Taken together, these results suggest that GiniClust3 is a powerful tool to identify both common and rare cell populations and can handle large datasets. GiniClust3 is implemented in the open-source python package and available at
2019-10-08 | High-Performance Molecular Imaging with MALDI Trapped Ion-Mobility Time-of-Flight (timsTOF) Mass Spectrometry | Spraggins JM, Djambazova KV, Rivera ES, Migas LG, Neumann EK, Fuetterer A, Suetering J, Goedecke N, Ly A, Van de Plas R, Caprioli RM. | TMC-Vanderbilt (Kidney) |
2019-10-11 | Unsupervised machine learning for exploratory data analysis in imaging mass spectrometry | Verbeeck N, Caprioli RM, Van de Plas R. | TMC-Vanderbilt (Kidney) | Imaging mass spectrometry (IMS) is a rapidly advancing molecular imaging modality that can map the spatial distribution of molecules with high chemical specificity. IMS does not require prior tagging of molecular targets and is able to measure a large number of ions concurrently in a single experiment. While this makes it particularly suited for exploratory analysis, the large amount and high-dimensional nature of data generated by IMS techniques make automated computational analysis indispensable. Research into computational methods for IMS data has touched upon different aspects, including spectral preprocessing, data formats, dimensionality reduction, spatial registration, sample classification, differential analysis between IMS experiments, and data-driven fusion methods to extract patterns corroborated by both IMS and other imaging modalities. In this work, we review unsupervised machine learning methods for exploratory analysis of IMS data, with particular focus on (a) factorization, (b) clustering, and (c) manifold learning. To provide a view across the various IMS modalities, we have attempted to include examples from a range of approaches including matrix assisted laser desorption/ionization, desorption electrospray ionization, and secondary ion mass spectrometry-based IMS. This review aims to be an entry point for both (i) analytical chemists and mass spectrometry experts who want to explore computational techniques; and (ii) computer scientists and data mining specialists who want to enter the IMS field.
2019-10-14 | High-throughput sequencing of the transcriptome and chromatin accessibility in the same cell | Chen S, Lake BB, Zhang K. | TMC-UCSD | Single-cell RNA sequencing can reveal the transcriptional state of cells, yet provides little insight into the upstream regulatory landscape associated with open or accessible chromatin regions. Joint profiling of accessible chromatin and RNA within the same cells would permit direct matching of transcriptional regulation to its outputs. Here, we describe droplet-based single-nucleus chromatin accessibility and mRNA expression sequencing (SNARE-seq), a method that can link a cell’s transcriptome with its accessible chromatin for sequencing at scale. Specifically, accessible sites are captured by Tn5 transposase in permeabilized nuclei to permit, within many droplets in parallel, DNA barcode tagging together with the mRNA molecules from the same cells. To demonstrate the utility of SNARE-seq, we generated joint profiles of 5,081 and 10,309 cells from neonatal and adult mouse cerebral cortices, respectively. We reconstructed the transcriptome and epigenetic landscapes of major and rare cell types, uncovered lineage-specific accessible sites, especially for low-abundance cells, and connected the dynamics of promoter accessibility with transcription level during neurogenesis.
2019-11-13High spatial resolution imaging of biological tissues using nanospray desorption electrospray ionization mass spectrometryYin R, Burnum-Johnson KE, Sun X, Dey SK & Laskin JTTD-Purdue
2019-11-15Continuous State HMMs for Modeling Time Series Single Cell RNA-Seq DataLin C, Bar-Joseph Z.HIVE TC-CMUMOTIVATION: Methods for reconstructing developmental trajectories from time series single cell RNA-Seq (scRNA-Seq) data can be largely divided into two categories. The first, often referred to as pseudotime ordering methods, are deterministic and rely on dimensionality reduction followed by an ordering step. The second learns a probabilistic branching model to represent the developmental process. While both types have been successful, each suffers from shortcomings that can impact their accuracy. RESULTS: We developed a new method based on continuous state HMMs (CSHMMs) for representing and modeling time series scRNA-Seq data. We define the CSHMM model and provide efficient learning and inference algorithms which allow the method to determine both the structure of the branching process and the assignment of cells to these branches. Analyzing several developmental single cell datasets, we show that the CSHMM method accurately infers branching topology and correctly and continuously assigns cells to paths, improving upon prior methods proposed for this task. Analysis of genes based on the continuous cell assignment identifies known and novel markers for different cell types. AVAILABILITY: Software and Supporting website: SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
2019-12-12Toward a Common Coordinate Framework for the Human BodyRood JE, Stuart T, Ghazanfar S, Biancalani T, Fisher E, Butler A, Hupalowska A, Gaffney L, Mauck W, Eraslan G, Marioni JC, Regev A, Satija R.HIVE MC-NYGCUnderstanding the genetic and molecular drivers of phenotypic heterogeneity across individuals is central to biology. As new technologies enable fine-grained and spatially resolved molecular profiling, we need new computational approaches to integrate data from the same organ across different individuals into a consistent reference and to construct maps of molecular and cellular organization at histological and anatomical scales. Here, we review previous efforts and discuss challenges involved in establishing such a common coordinate framework, the underlying map of tissues and organs. We focus on strategies to handle anatomical variation across individuals and highlight the need for new technologies and analytical methods spanning multiple hierarchical scales of spatial resolution.
2019-12-20Uncovering matrix effects on lipid analyses in MALDI imaging mass spectrometry experimentsPerry WJ, Patterson NH, Prentice BM, Neumann EK, Caprioli RM, Spraggins JM.TMC-Vanderbilt (Kidney)The specific matrix used in matrix‐assisted laser desorption/ionization imaging mass spectrometry (MALDI IMS) can have an effect on the molecules ionized from a tissue sample. The sensitivity for distinct classes of biomolecules can vary when employing different MALDI matrices. Here, we compare the intensities of various lipid subclasses measured by Fourier transform ion cyclotron resonance (FT‐ICR) IMS of murine liver tissue when using 9‐aminoacridine (9AA), 5‐chloro‐2‐mercaptobenzothiazole (CMBT), 1,5‐diaminonaphthalene (DAN), 2,5‐Dihydroxyacetophenone (DHA), and 2,5‐dihydroxybenzoic acid (DHB). Principal component analysis and receiver operating characteristic curve analysis revealed significant matrix effects on the relative signal intensities observed for different lipid subclasses and adducts. Comparison of spectral profiles and quantitative assessment of the number and intensity of species from each lipid subclass showed that each matrix produces unique lipid signals. In positive ion mode, matrix application methods played a role in the MALDI analysis for different cationic species. Comparisons of different methods for the application of DHA showed a significant increase in the intensity of sodiated and potassiated analytes when using an aerosol sprayer. In negative ion mode, lipid profiles generated using DAN were significantly different than all other matrices tested. This difference was found to be driven by modification of phosphatidylcholines during ionization that enables them to be detected in negative ion mode. These modified phosphatidylcholines are isomeric with common phosphatidylethanolamines confounding MALDI IMS analysis when using DAN. These results show an experimental basis of MALDI analyses when analyzing lipids from tissue and allow for more informed selection of MALDI matrices when performing lipid IMS experiments.
2019-12-23Normalization and variance stabilization of single-cell RNA-seq data using regularized negative binomial regressionHafemeister C, Satija R.HIVE MC-NYGCSingle-cell RNA-seq (scRNA-seq) data exhibits significant cell-to-cell variation due to technical factors, including the number of molecules detected in each cell, which can confound biological heterogeneity with technical effects. To address this, we present a modeling framework for the normalization and variance stabilization of molecular count data from scRNA-seq experiments. We propose that the Pearson residuals from “regularized negative binomial regression,” where cellular sequencing depth is utilized as a covariate in a generalized linear model, successfully remove the influence of technical characteristics from downstream analyses while preserving biological heterogeneity. Importantly, we show that an unconstrained negative binomial model may overfit scRNA-seq data, and overcome this by pooling information across genes with similar abundances to obtain stable parameter estimates. Our procedure omits the need for heuristic steps including pseudocount addition or log-transformation and improves common downstream analytical tasks such as variable gene selection, dimensional reduction, and differential expression. Our approach can be applied to any UMI-based scRNA-seq dataset and is freely available as part of the R package sctransform, with a direct interface to our single-cell toolkit Seurat.
2019-12-26Deep learning for inferring gene relationships from single-cell expression dataYuan Y, Bar-Joseph Z.HIVE TC-CMUSeveral methods were developed to mine gene–gene relationships from expression data. Examples include correlation and mutual information methods for coexpression analysis, clustering and undirected graphical models for functional assignments, and directed graphical models for pathway reconstruction. Using an encoding for gene expression data, followed by deep neural networks analysis, we present a framework that can successfully address all of these diverse tasks. We show that our method, convolutional neural network for coexpression (CNNC), improves upon prior methods in tasks ranging from predicting transcription factor targets to identifying disease-related genes to causality inference. CNNC’s encoding provides insights about some of the decisions it makes and their biological basis. CNNC is flexible and can easily be extended to integrate additional types of genomics data, leading to further improvements in its performance.
2019-12-31Immune monitoring using mass cytometry and related high-dimensional imaging approachesHartmann FJ, Bendall SC.RTI-StanfordThe cellular complexity and functional diversity of the human immune system necessitate the use of high-dimensional single-cell tools to uncover its role in multifaceted diseases such as rheumatic diseases, as well as other autoimmune and inflammatory disorders. Proteomic technologies that use elemental (heavy metal) reporter ions, such as mass cytometry (also known as CyTOF) and analogous high-dimensional imaging approaches (including multiplexed ion beam imaging (MIBI) and imaging mass cytometry (IMC)), have been developed from their low-dimensional counterparts, flow cytometry and immunohistochemistry, to meet this need. A growing number of studies have been published that use these technologies to identify functional biomarkers and therapeutic targets in rheumatic diseases, but the full potential of their application to rheumatic disease research has yet to be fulfilled. This Review introduces the underlying technologies for high-dimensional immune monitoring and discusses aspects necessary for their successful implementation, including study design principles, analytical tools and future developments for the field of rheumatology.
2020-01-07Automated mass spectrometry imaging of over 2000 proteins from tissue sections at 100-μm spatial resolutionPiehowski PD, Zhu Y, Bramer LM, Stratton KG, Zhao R, Orton DJ, Moore RJ, Yuan J, Mitchell HD, Gao Y, Webb-Robertson BM, Dey SK, Kelly RT, Burnum-Johnson KE.TTD-Purdue
2020-02-18Inferring TF activation order in time series scRNA-Seq studiesLin C, Ding J, Bar-Joseph Z.HIVE TC-CMUMethods for the analysis of time series single cell expression data (scRNA-Seq) either do not utilize information about transcription factors (TFs) and their targets or only study these as a post-processing step. Using such information can both improve the accuracy of the reconstructed model and cell assignments and provide information on how and when the process is regulated. We developed the Continuous-State Hidden Markov Models TF (CSHMM-TF) method, which integrates probabilistic modeling of scRNA-Seq data with the ability to assign TFs to specific activation points in the model. TFs are assumed to influence the emission probabilities for cells assigned to later time points, allowing us to identify not just the TFs controlling each path but also their order of activation. We tested CSHMM-TF on several mouse and human datasets. As we show, the method was able to identify known and novel TFs for all processes, the assigned time of activation agrees with both expression information and prior knowledge, and combinatorial predictions are supported by known interactions. We also show that CSHMM-TF improves upon prior methods that do not utilize TF-gene interaction
2020-03-11Multiplexed single-cell morphometry for hematopathology diagnosticsTsai AG, Glass DR, Juntilla M, Hartmann FJ, Oak JS, Fernandez-Pol S, Ohgami RS, Bendall SC.RTI-StanfordThe diagnosis of lymphomas and leukemias requires hematopathologists to integrate microscopically visible cellular morphology with antibody-identified cell surface molecule expression. To merge these into one high-throughput, highly multiplexed, single-cell assay, we quantify cell morphological features by their underlying, antibody-measurable molecular components, which empowers mass cytometers to ‘see’ like pathologists. When applied to 71 diverse clinical samples, single-cell morphometric profiling reveals robust and distinct patterns of ‘morphometric’ markers for each major cell type. Individually, lamin B1 highlights acute leukemias, lamin A/C helps distinguish normal from neoplastic mature T cells, and VAMP-7 recapitulates light-cytometric side scatter. Combined with machine learning, morphometric markers form intuitive visualizations of normal and neoplastic cellular distribution and differentiation. When recalibrated for myelomonocytic blast enumeration, this approach is superior to flow cytometry and comparable to expert microscopy, bypassing years of specialized training. The contextualization of traditional surface markers on independent morphometric frameworks permits more sensitive and automated diagnosis of complex hematopoietic diseases.
2020-03-13Considerations for Using the Vasculature as a Coordinate System to Map All the Cells in the Human BodyWeber GM, Ju Y, Börner K.HIVE MC-IUSeveral ongoing international efforts are developing methods of localizing single cells within organs or mapping the entire human body at the single cell level, including the Chan Zuckerberg Initiative’s Human Cell Atlas (HCA), the Knut and Alice Wallenberg Foundation’s Human Protein Atlas (HPA), and the National Institutes of Health’s Human BioMolecular Atlas Program (HuBMAP). Their goals are to understand cell specialization, interactions, spatial organization in their natural context, and ultimately the function of every cell within the body. In the same way that the Human Genome Project had to assemble sequence data from different people to construct a complete sequence, multiple centers around the world are collecting tissue specimens from diverse populations that vary in age, race, sex, and body size. A challenge will be combining these heterogeneous tissue samples into a 3D reference map that will enable multiscale, multidimensional Google Maps-like exploration of the human body. Key to making alignment of tissue samples work is identifying and using a coordinate system called a Common Coordinate Framework (CCF), which defines the positions, or “addresses,” in a reference body, from whole organs down to functional tissue units and individual cells. In this perspective, we examine the concept of a CCF based on the vasculature and describe why it would be an attractive choice for mapping the human body.
2020-03-27Tools for the analysis of high-dimensional single-cell RNA sequencing dataWu Y, Zhang K.TMC-UCSDBreakthroughs in the development of high-throughput technologies for profiling transcriptomes at the single-cell level have helped biologists to understand the heterogeneity of cell populations, disease states and developmental lineages. However, these single-cell RNA sequencing (scRNA-seq) technologies generate an extraordinary amount of data, which creates analysis and interpretation challenges. Additionally, scRNA-seq datasets often contain technical sources of noise owing to incomplete RNA capture, PCR amplification biases and/or batch effects specific to the patient or sample. If not addressed, this technical noise can bias the analysis and interpretation of the data. In response to these challenges, a suite of computational tools has been developed to process, analyse and visualize scRNA-seq datasets. Although the specific steps of any given scRNA-seq analysis might differ depending on the biological questions being asked, a core workflow is used in most analyses. Typically, raw sequencing reads are processed into a gene expression matrix that is then normalized and scaled to remove technical noise. Next, cells are grouped according to similarities in their patterns of gene expression, which can be summarized in two or three dimensions for visualization on a scatterplot. These data can then be further analysed to provide an in-depth view of the cell types or developmental trajectories in the sample of interest.
2020-04-01Integrated molecular imaging technologies for investigation of metals in biological systems: A brief reviewPerry WJ, Weiss A, Van de Plas R, Spraggins JM, Caprioli RM, Skaar EP.TMC-Vanderbilt (Kidney)Metals play an essential role in biological systems and are required as structural or catalytic co-factors in many proteins. Disruption of the homeostatic control and/or spatial distributions of metals can lead to disease. Imaging technologies have been developed to visualize elemental distributions across a biological sample. Measurement of elemental distributions by imaging mass spectrometry and imaging X-ray fluorescence are increasingly employed with technologies that can assess histological features and molecular compositions. Data from several modalities can be interrogated as multimodal images to correlate morphological, elemental, and molecular properties. Elemental and molecular distributions have also been axially resolved to achieve three-dimensional volumes, dramatically increasing the biological information. In this review, we provide an overview of recent developments in the field of metal imaging with an emphasis on multimodal studies in two and three dimensions. We specifically highlight studies that present technological advancements and biological applications of how metal homeostasis affects human health.
2020-04-02Reconstructed Single-Cell Fate Trajectories Define Lineage Plasticity Windows during Differentiation of Human PSC-Derived Distal Lung ProgenitorsHurley K, Ding J, Villacorta-Martin C, Herriges MJ, Jacob A, Vedaie M, Alysandratos KD, Sun YL, Lin C, Werder RB, Huang J, Wilson AA, Mithal A, Mostoslavsky G, Oglesby I, Caballero IS, Guttentag SH, Ahangari F, Kaminski N, Rodriguez-Fraticelli A, Camargo F, Bar-Joseph Z, Kotton DN.HIVE TC-CMUAlveolar epithelial type 2 cells (AEC2s) are the facultative progenitors responsible for maintaining lung alveoli throughout life but are difficult to isolate from patients. Here, we engineer AEC2s from human pluripotent stem cells (PSCs) in vitro and use time-series single-cell RNA sequencing with lentiviral barcoding to profile the kinetics of their differentiation in comparison to primary fetal and adult AEC2 benchmarks. We observe bifurcating cell-fate trajectories as primordial lung progenitors differentiate in vitro, with some progeny reaching their AEC2 fate target, while others diverge to alternative non-lung endodermal fates. We develop a Continuous State Hidden Markov model to identify the timing and type of signals, such as overexuberant Wnt responses, that induce some early multipotent NKX2-1+ progenitors to lose lung fate. Finally, we find that this initial developmental plasticity is regulatable and subsides over time, ultimately resulting in PSC-derived AEC2s that exhibit a stable phenotype and nearly limitless self-renewal capacity.
2020-05-14Use of Single Cell -omic Technologies to Study the Gastrointestinal Tract and Diseases, From Single Cell Identities to Patient FeaturesIslam M, Chen B, Spraggins JM, Kelly RT, Lau KS.TMC-Vanderbilt (Kidney)Single cells are the building blocks of tissue systems that determine organ phenotypes, behaviors, and function. Understanding the differences between cell types and their activities might provide us with insights into normal tissue functions, development of disease, and new therapeutic strategies. Although -omic level single cell technologies are a relatively recent development that have been used only in laboratory studies, these approaches might eventually be used in the clinic. We review the prospects of applying single cell genome, transcriptome, epigenome, proteome, and metabolome analyses to gastroenterology and hepatology research. Combining data from multi-omic platforms and rapid technological developments could lead to new diagnostic, prognostic, and therapeutic approaches.
2020-05-19Discovering New Lipidomic Features Using Cell Type Specific Fluorophore Expression to Provide Spatial and Biological Specificity in a Multimodal Workflow with MALDI Imaging Mass SpectrometryJones MA, Cho SH, Patterson NH, Van de Plas R, Spraggins JM, Boothby MR, Caprioli RM.TMC-Vanderbilt (Kidney)Identifying the spatial distributions of biomolecules in tissue is crucial for understanding integrated function. Imaging mass spectrometry (IMS) allows simultaneous mapping of thousands of biosynthetic products such as lipids but has needed a means of identifying specific cell-types or functional states to correlate with molecular localization. We report, here, advances starting from identity marking with a genetically encoded fluorophore. The fluorescence emission data were integrated with IMS data through multimodal image processing with advanced registration techniques and data-driven image fusion. In an unbiased analysis of spleens, this integrated technology enabled identification of ether lipid species preferentially enriched in germinal centers. We propose that this use of genetic marking for microanatomical regions of interest can be paired with molecular information from IMS for any tissue, cell-type, or activity state for which fluorescence is driven by a gene-tracking allele and ultimately with outputs of other means of spatial mapping.
2020-06-16Single-cell Lineage Tracing by Integrating CRISPR-Cas9 Mutations With Transcriptomic DataZafar H, Lin C, Bar-Joseph Z.HIVE TC-CMURecent studies combine two novel technologies, single-cell RNA-sequencing and CRISPR-Cas9 barcode editing, for elucidating developmental lineages at the whole organism level. While these studies provided several insights, they face several computational challenges. First, lineages are reconstructed based on noisy and often saturated random mutation data. Additionally, due to the randomness of the mutations, lineages from multiple experiments cannot be combined to reconstruct a species-invariant lineage tree. To address these issues we developed a statistical method, LinTIMaT, which reconstructs cell lineages using a maximum-likelihood framework by integrating mutation and expression data. Our analysis shows that expression data helps resolve the ambiguities arising when lineages are inferred based on mutations alone, while also enabling the integration of different individual lineages for the reconstruction of an invariant lineage tree. LinTIMaT lineages have better cell type coherence, improve the functional significance of gene sets and provide new insights on progenitors and differentiation pathways.
2020-06-30A Cancer Biologist's Primer on Machine Learning Applications in High-Dimensional CytometryKeyes TJ, Domizi P, Lo YC, Nolan GP, Davis KLTMC-StanfordThe application of machine learning and artificial intelligence to high-dimensional cytometry data sets has increasingly become a staple of bioinformatic data analysis over the past decade. This is especially true in the field of cancer biology, where protocols for collecting multiparameter single-cell data in a high-throughput fashion are rapidly developed. As the use of machine learning methodology in cytometry becomes increasingly common, there is a need for cancer biologists to understand the basic theory and applications of a variety of algorithmic tools for analyzing and interpreting cytometry data. We introduce the reader to several keystone machine learning-based analytic approaches with an emphasis on defining key terms and introducing a conceptual framework for making translational or clinically relevant discoveries. The target audience consists of cancer cell biologists and physician-scientists interested in applying these tools to their own data, but who may have limited training in bioinformatics. © 2020 International Society for Advancement of Cytometry.
2020-07-14An Integrated Multi-omic Single-Cell Atlas of Human B Cell Identity.Glass DR, Tsai AG, Oliveria JP, Hartmann FJ, Kimmey SC, Calderon AA, Borges L, Glass MC, Wagar LE, Davis MM, Bendall SC.RTI-StanfordB cells are capable of a wide range of effector functions including antibody secretion, antigen presentation, cytokine production, and generation of immunological memory. A consistent strategy for classifying human B cells by using surface molecules is essential to harness this functional diversity for clinical translation. We developed a highly multiplexed screen to quantify the co-expression of 351 surface molecules on millions of human B cells. We identified differentially expressed molecules and aligned their variance with isotype usage, VDJ sequence, metabolic profile, biosynthesis activity, and signaling response. Based on these analyses, we propose a classification scheme to segregate B cells from four lymphoid tissues into twelve unique subsets, including a CD45RB+CD27- early memory population, a class-switched CD39+ tonsil-resident population, and a CD19hiCD11c+ memory population that potently responds to immune activation. This classification framework and underlying datasets provide a resource for further investigations of human B cell identity and function.
2020-07-23Multimodal Analysis of Composition and Spatial Architecture in Human Squamous Cell CarcinomaJi AL, Rubin AJ, Thrane K, Jiang S, Reynolds DL, Meyers RM, Guo MG, George BM, Mollbrink A, Bergenstråhle J, Larsson L, Bai Y, Zhu B, Bhaduri A, Meyers JM, Rovira-Clavé X, Hollmig ST, Aasi SZ, Nolan GP, Lundeberg J, Khavari PATMC-StanfordTo define the cellular composition and architecture of cutaneous squamous cell carcinoma (cSCC), we combined single-cell RNA sequencing with spatial transcriptomics and multiplexed ion beam imaging from a series of human cSCCs and matched normal skin. cSCC exhibited four tumor subpopulations, three recapitulating normal epidermal states, and a tumor-specific keratinocyte (TSK) population unique to cancer, which localized to a fibrovascular niche. Integration of single-cell and spatial data mapped ligand-receptor networks to specific cell types, revealing TSK cells as a hub for intercellular communication. Multiple features of potential immunosuppression were observed, including T regulatory cell (Treg) co-localization with CD8 T cells in compartmentalized tumor stroma. Finally, single-cell characterization of human tumor xenografts and in vivo CRISPR screens identified essential roles for specific tumor subpopulation-enriched gene networks in tumorigenesis. These data define cSCC tumor and stromal cell subpopulations, the spatial niches where they interact, and the communicating gene networks that they engage in cancer.
2020-08-31Single-cell metabolic profiling of human cytotoxic T cellsHartmann FJ, Mrdjen D, McCaffrey E, Glass DR, Greenwald NF, Bharadwaj A, Khair Z, Verberk SGS, Baranski A, Baskar R, Graf W, Van Valen D, Van den Bossche J, Angelo M, Bendall SCRTI-StanfordCellular metabolism regulates immune cell activation, differentiation and effector functions, but current metabolic approaches lack single-cell resolution and simultaneous characterization of cellular phenotype. In this study, we developed an approach to characterize the metabolic regulome of single cells together with their phenotypic identity. The method, termed single-cell metabolic regulome profiling (scMEP), quantifies proteins that regulate metabolic pathway activity using high-dimensional antibody-based technologies. We employed mass cytometry (cytometry by time of flight, CyTOF) to benchmark scMEP against bulk metabolic assays by reconstructing the metabolic remodeling of in vitro-activated naive and memory CD8+ T cells. We applied the approach to clinical samples and identified tissue-restricted, metabolically repressed cytotoxic T cells in human colorectal carcinoma. Combining our method with multiplexed ion beam imaging by time of flight (MIBI-TOF), we uncovered the spatial organization of metabolic programs in human tissues, which indicated exclusion of metabolically repressed immune cells from the tumor-immune boundary. Overall, our approach enables robust approximation of metabolic and functional states in individual cells.
2020-09-01Nanopore sequencing and the Shasta toolkit enable efficient de novo assembly of eleven human genomesShafin K, Pesout T, Lorig-Roach R, Haukness M, Olsen HE, Bosworth C, Armstrong J, Tigyi K, Maurer N, Koren S, Sedlazeck FJ, Marschall T, Mayes S, Costa V, Zook JM, Liu KJ, Kilburn D, Sorensen M, Munson KM, Vollger MR, Monlong J, Garrison E, Eichler EE, Salama S, Haussler D, Green RE, Akeson M, Phillippy A, Miga KH, Carnevali P, Jain M, Paten BHIVE TC-CMUDe novo assembly of a human genome using nanopore long-read sequences has been reported, but it used more than 150,000 CPU hours and weeks of wall-clock time. To enable rapid human genome assembly, we present Shasta, a de novo long-read assembler, and polishing algorithms named MarginPolish and HELEN. Using a single PromethION nanopore sequencer and our toolkit, we assembled 11 highly contiguous human genomes de novo in 9 d. We achieved roughly 63× coverage, 42-kb read N50 values and 6.5× coverage in reads >100 kb using three flow cells per sample. Shasta produced a complete haploid human genome assembly in under 6 h on a single commercial compute node. MarginPolish and HELEN polished haploid assemblies to more than 99.9% identity (Phred quality score QV = 30) with nanopore reads alone. Addition of proximity-ligation sequencing enabled near chromosome-level scaffolds for all 11 genomes. We compare our assembly performance to existing methods for diploid, haploid and trio-binned human samples and report superior accuracy and speed.
2020-09-04The impact of air transport availability on research collaboration: A case study of four universitiesPloszaj A, Yan X, Börner K.HIVE MC-IUThis paper analyzes the impact of air transport connectivity and accessibility on scientific collaboration. Numerous studies demonstrated that the likelihood of collaboration declines with increase in distance between potential collaborators. These works commonly use simple measures of physical distance rather than actual flight capacity and frequency. Our study addresses this limitation by focusing on the relationship between flight availability and the number of scientific co-publications. Furthermore, we distinguish two components of flight availability: (1) direct and indirect air connections between airports; and (2) distance to the nearest airport from cities and towns where authors of scientific articles have their professional affiliations. Based on Zero-inflated Negative Binomial Regression, we provide evidence that greater flight availability is associated with more frequent scientific collaboration. More flight connections (connectivity) and proximity of airport (accessibility) increase the expected number of coauthored scientific papers. Moreover, direct flights and flights with one transfer are more valuable for intensifying scientific cooperation than travels involving more connecting flights. Further, analysis of four organizational sub-datasets-Arizona State University, Indiana University Bloomington, Indiana University-Purdue University Indianapolis, and University of Michigan-shows that the relationship between airline transport availability and scientific collaboration is not uniform, but is associated with the research profile of an institution and the characteristics of the airport that serves this institution.
2020-10-09An Integrated Microfluidic Probe for Mass Spectrometry Imaging of Biological SamplesLi X, Yin R, Hu H, Li Y, Sun X, Dey SK, Laskin J.TTD-PurdueAmbient ionization based on liquid extraction is widely used in mass spectrometry imaging (MSI) of molecules in biological samples. The development of nanospray desorption electrospray ionization (nano-DESI) has enabled the robust imaging of tissue sections with high spatial resolution. However, the fabrication of the nano-DESI probe is challenging, which limits its dissemination to the broader scientific community. Herein, we describe the design and performance of an integrated microfluidic probe (iMFP) for nano-DESI MSI. The glass iMFP, fabricated using photolithography, wet etching, and polishing, shows comparable performance to the capillary-based nano-DESI MSI in terms of stability and sensitivity; a spatial resolution of better than 25 μm was obtained in these first proof-of-principle experiments. The iMFP is easy to operate and align in front of a mass spectrometer, which will facilitate broader use of liquid-extraction-based MSI in biological research, drug discovery, and clinical studies.
2020-10-19CDKL5: a promising new therapeutic target for acute kidney injury?de Caestecker MP.TMC-Vanderbilt (Kidney)Online ahead of print. No abstract available.
2020-10-27Iterative point set registration for aligning scRNA-seq dataAlavi A, Bar-Joseph ZHIVE TC-CMUSeveral studies profile similar single cell RNA-Seq (scRNA-Seq) data using different technologies and platforms. A number of alignment methods have been developed to enable the integration and comparison of scRNA-Seq data from such studies. While each performs well on some of the datasets, to date no method was able to both perform the alignment using the original expression space and generalize to new data. To enable such analysis we developed Single Cell Iterative Point set Registration (SCIPR) which extends methods that were successfully applied to align image data to scRNA-Seq. We discuss the required changes needed, the resulting optimization function, and algorithms for learning a transformation function for aligning data. We tested SCIPR on several scRNA-Seq datasets. As we show it successfully aligns data from several different cell types, improving upon prior methods proposed for this task. In addition, we show the parameters learned by SCIPR can be used to align data not used in the training and to identify key cell type-specific genes.
2020-11-01High-Parameter Immune Profiling with CyTOFSahaf B, Rahman A, Maecker HT, Bendall SCRTI-StanfordMass cytometry, or CyTOF, is a useful technology for high-parameter single-cell phenotyping, especially from suspension cells such as blood or PBMC. It is particularly appealing to monitor the systemic immune changes that could accompany cancer immunotherapy. Here we present a reference panel for identification of all major immune cell populations, with flexibility for addition of trial-specific markers. We also describe best-practice measures for minimizing and tracking batch variability. These include: sample barcoding, use of spiked-in reference cells, and lyophilization of the antibody cocktail. Our protocol assumes the use of cryopreserved PBMC, both for convenience of batching samples and for maximum comparability across patients and time points. Finally, we show an option for automated analysis using the Astrolabe platform (Astrolabe Diagnostics, Inc.).
2020-11-02Landscape of coordinated immune responses to H1N1 challenge in humans.Rahil Z, Leylek R, Schürch CM, Chen H, Bjornson-Hooper Z, Christensen SR, Gherardini PF, Bhate SS, Spitzer MH, Fragiadakis GK, Mukherjee N, Kim N, Jiang S, Yo J, Gaudilliere B, Affrime M, Bock B, Hensley SE, Idoyaga J, Aghaeepour N, Kim K, Nolan GP, McIlwain DR.TMC-StanfordInfluenza is a significant cause of morbidity and mortality worldwide. Here we show changes in the abundance and activation states of more than 50 immune cell subsets in 35 individuals over 11 time points during human A/California/2009 (H1N1) virus challenge monitored using mass cytometry along with other clinical assessments. Peak change in monocyte, B cell, and T cell subset frequencies coincided with peak virus shedding, followed by marked activation of T and NK cells. Results led to the identification of CD38 as a critical regulator of plasmacytoid dendritic cell function in response to influenza virus. Machine learning using study-derived clinical parameters and single-cell data effectively classified and predicted susceptibility to infection. The coordinated immune cell dynamics defined in this study provide a framework for identifying novel correlates of protection in the evaluation of future influenza therapeutics.
2020-11-13Guidelines for reporting single-cell RNA-seq experiments.Füllgrabe A, George N, Green M, Nejad P, Aronow B, Fexova SK, Fischer C, Freeberg MA, Huerta L, Morrison N, Scheuermann RH, Taylor D, Vasilevsky N, Clarke L, Gehlenborg N, Kent J, Marioni J, Teichmann S, Brazma A, Papatheodorou IHIVE TC-HarvardNo abstract available.
2020-12-01Effect of MALDI matrices on lipid analyses of biological tissues using MALDI-2 postionization mass spectrometryMcMillen JC, Fincher JA, Klein DR, Spraggins JM, Caprioli RMTMC-Vanderbilt (Kidney)Matrix-assisted laser desorption/ionization imaging mass spectrometry (MALDI IMS) allows for highly multiplexed, untargeted detection of many hundreds of analytes from tissue. Recently, laser postionization (MALDI-2) has been developed for increased ion yield and sensitivity for lipid IMS. However, the dependence of MALDI-2 performance on the various lipid classes is largely unknown. To understand the effect of the applied matrix on MALDI-2 analysis of lipids, samples including an equimolar lipid standard mixture, various tissue homogenates, and intact rat kidney tissue sections were analyzed using the following matrices: α-cyano-4-hydroxycinnamic acid, 2',5'-dihydroxyacetophenone, 2',5'-dihydroxybenzoic acid (DHB), and norharmane (NOR). Lipid signal enhancement of protonated species using MALDI-2 technology varied based on the matrix used. Although signal improvements were observed for all matrices, the most dramatic effects using MALDI-2 were observed using NOR and DHB. For lipid standards analyzed by MALDI-2, NOR provided the broadest coverage, enabling the detection of all 13 protonated standards, including nonpolar lipids, whereas DHB gave less coverage but gave the highest signal increase for those lipids recorded. With respect to tissue homogenates and rat kidney tissue, mass spectra were compared and showed that the number and intensity of neutral lipids tentatively identified with MALDI-2 using NOR increased significantly (e.g., fivefold intensity increase for triacylglycerol). In the cases of DHB with MALDI-2, the number of protonated lipids identified from tissue homogenates doubled with 152 on average compared with 76 with MALDI alone. High spatial resolution imaging (~20 μm) of rat kidney tissue showed similar results using DHB with 125 lipids tentatively identified from MALDI-2 spectra versus just 72 using standard MALDI. From the four matrices tested, NOR provided the greatest increase in sensitivity for neutral lipids (triacylglycerol, diacylglycerol, monoacylglycerol, and cholesterol ester), and DHB provided the highest overall number of lipids detected using MALDI-2 technology.
2020-12-10GCNG: graph convolutional networks for inferring gene interaction from spatial transcriptomics dataYuan Y, Bar-Joseph ZHIVE TC-CMUMost methods for inferring gene-gene interactions from expression data focus on intracellular interactions. The availability of high-throughput spatial expression data opens the door to methods that can infer such interactions both within and between cells. To achieve this, we developed Graph Convolutional Neural networks for Genes (GCNG). GCNG encodes the spatial information as a graph and combines it with expression data using supervised training. GCNG improves upon prior methods used to analyze spatial transcriptomics data and can propose novel pairs of extracellular interacting genes. The output of GCNG can also be used for downstream analysis including functional gene assignment. Supporting website with software and data: .
2021-01-27Integrated spatial genomics reveals global architecture of single nucleiTakei Y, Yun J, Zheng S, Ollikainen N, Pierson N, White J, Shah S, Thomassie J, Suo S, Eng CL, Guttman M, Yuan GC, Cai L.TMC-Cal TechIdentifying the relationships between chromosome structures, nuclear bodies, chromatin states and gene expression is an overarching goal of nuclear-organization studies. Because individual cells appear to be highly variable at all these levels, it is essential to map different modalities in the same cells. Here we report the imaging of 3,660 chromosomal loci in single mouse embryonic stem (ES) cells using DNA seqFISH+, along with 17 chromatin marks and subnuclear structures by sequential immunofluorescence and the expression profile of 70 RNAs. Many loci were invariably associated with immunofluorescence marks in single mouse ES cells. These loci form 'fixed points' in the nuclear organizations of single cells and often appear on the surfaces of nuclear bodies and zones defined by combinatorial chromatin marks. Furthermore, highly expressed genes appear to be pre-positioned to active nuclear zones, independent of bursting dynamics in single cells. Our analysis also uncovered several distinct mouse ES cell subpopulations with characteristic combinatorial chromatin states. Using clonal analysis, we show that the global levels of some chromatin marks, such as H3 trimethylation at lysine 27 (H3K27me3) and macroH2A1 (mH2A1), are heritable over at least 3-4 generations, whereas other marks fluctuate on a faster time scale. This seqFISH+-based spatial multimodal approach can be used to explore nuclear organization and cell states in diverse biological systems.
2021-03-08Giotto: a toolbox for integrative analysis and visualization of spatial expression dataDries R, Zhu Q, Dong R, Eng CL, Li H, Liu K, Fu Y, Zhao T, Sarkar A, Bao F, George RE, Pierson N, Cai L, Yuan GC.TTD-Cal TechSpatial transcriptomic and proteomic technologies have provided new opportunities to investigate cells in their native microenvironment. Here we present Giotto, a comprehensive and open-source toolbox for spatial data analysis and visualization. The analysis module provides end-to-end analysis by implementing a wide range of algorithms for characterizing tissue composition, spatial expression patterns, and cellular interactions. Furthermore, single-cell RNAseq data can be integrated for spatial cell-type enrichment analysis. The visualization module allows users to interactively visualize analysis outputs and imaging features. To demonstrate its general applicability, we apply Giotto to a wide range of datasets encompassing diverse technologies and platforms.
