HuBMAP Publications

There are 333 publications.
Publish Date | Title | Author(s) | HuBMAP Component | Abstract
2023-06-01 | An integrated cell atlas of the lung in health and disease | Sikkema L, Ramírez-Suástegui C, Strobl DC, Gillett TE, Zappia L, Madissoon E, Markov NS, Zaragosi LE, Ji Y, Ansari M, Arguel MJ, Apperloo L, Banchero M, Bécavin C, Berg M, Chichelnitskiy E, Chung MI, Collin A, Gay ACA, Gote-Schniering J, Hooshiar Kashani B, Inecik K, Jain M, Kapellos TS, Kole TM, Leroy S, Mayr CH, Oliver AJ, von Papen M, Peter L, Taylor CJ, Walzthoeni T, Xu C, Bui LT, De Donno C, Dony L, Faiz A, Guo M, Gutierrez AJ, Heumos L, Huang N, Ibarra IL, Jackson ND, Kadur Lakshminarasimha Murthy P, Lotfollahi M, Tabib T, Talavera-López C, Travaglini KJ, Wilbrey-Clark A, Worlock KB, Yoshida M; Lung Biological Network Consortium; van den Berge M, Bossé Y, Desai TJ, Eickelberg O, Kaminski N, Krasnow MA, Lafyatis R, Nikolic MZ, Powell JE, Rajagopal J, Rojas M, Rozenblatt-Rosen O, Seibold MA, Sheppard D, Shepherd DP, Sin DD, Timens W, Tsankov AM, Whitsett J, Xu Y, Banovich NE, Barbry P, Duong TE, Falk CS, Meyer KB, Kropski JA, Pe'er D, Schiller HB, Tata PR, Schultze JL, Teichmann SA, Misharin AV, Nawijn MC | TMC-URMC | Single-cell technologies have transformed our understanding of human tissues. Yet, studies typically capture only a limited number of donors and disagree on cell type definitions. Integrating many single-cell datasets can address these limitations of individual studies and capture the variability present in the population. Here we present the integrated Human Lung Cell Atlas (HLCA), combining 49 datasets of the human respiratory system into a single atlas spanning over 2.4 million cells from 486 individuals. The HLCA presents a consensus cell type re-annotation with matching marker genes, including annotations of rare and previously undescribed cell types. Leveraging the number and diversity of individuals in the HLCA, we identify gene modules that are associated with demographic covariates such as age, sex and body mass index, as well as gene modules changing expression along the proximal-to-distal axis of the bronchial tree. Mapping new data to the HLCA enables rapid data annotation and interpretation. Using the HLCA as a reference for the study of disease, we identify shared cell states across multiple lung diseases, including SPP1+ profibrotic monocyte-derived macrophages in COVID-19, pulmonary fibrosis and lung carcinoma. Overall, the HLCA serves as an example for the development and use of large-scale, cross-dataset organ atlases within the Human Cell Atlas.
2018-10-29 | Identification of spatially associated subpopulations by combining scRNAseq and sequential fluorescence in situ hybridization data | Zhu Q, Shah S, Dries R, Cai L, Yuan GC. | TTD-Cal Tech | How intrinsic gene-regulatory networks interact with a cell's spatial environment to define its identity remains poorly understood. We developed an approach to distinguish between intrinsic and extrinsic effects on global gene expression by integrating analysis of sequencing-based and imaging-based single-cell transcriptomic profiles, using cross-platform cell type mapping combined with a hidden Markov random field model. We applied this approach to dissect the cell-type- and spatial-domain-associated heterogeneity in the mouse visual cortex region. Our analysis identified distinct spatially associated, cell-type-independent signatures in the glutamatergic and astrocyte cell compartments. Using these signatures to analyze single-cell RNA sequencing data, we identified previously unknown spatially associated subpopulations, which were validated by comparison with anatomical structures and Allen Brain Atlas images.
2018-12-11 | Forecasting innovations in science, technology, and education | Börner K, Rouse WB, Trunfio P, Stanley HE. | HIVE MC-IU | Human survival depends on our ability to predict future outcomes so that we can make informed decisions. Human cognition and perception are optimized for local, short-term decision-making, such as deciding when to fight or flight, whom to mate, or what to eat. For more elaborate decisions (e.g., when to harvest, when to go to war or not, and whom to marry), people used to consult oracles, prophetic predictions of the future inspired by the gods. Over time, oracles were replaced by models of the structure and dynamics of natural, technological, and social systems. In the 21st century, computational models and visualizations of model results inform much of our decision-making: near real-time weather forecasts help us decide when to take an umbrella, plant, or harvest; where to ground airplanes; or when to evacuate inhabitants in the path of a hurricane, tornado, or flood. Long-term weather and climate forecasts predict a future with increasing torrential rains, stronger winds, and more frequent drought, landslides, and forest fires as well as rising sea levels, enabling decision makers to prepare for these changes by building dikes, moving cities and roads, and building larger water reservoirs and better storm sewers.
2018-12-19 | Cell Hashing with barcoded antibodies enables multiplexing and doublet detection for single cell genomics | Stoeckius M, Zheng S, Houck-Loomis B, Hao S, Yeung BZ, Mauck WM, Smibert P, Satija R. | HIVE MC-NYGC | Despite rapid developments in single cell sequencing, sample-specific batch effects, detection of cell multiplets, and experimental costs remain outstanding challenges. Here, we introduce Cell Hashing, where oligo-tagged antibodies against ubiquitously expressed surface proteins uniquely label cells from distinct samples, which can be subsequently pooled. By sequencing these tags alongside the cellular transcriptome, we can assign each cell to its original sample, robustly identify cross-sample multiplets, and “super-load” commercial droplet-based systems for significant cost reduction. We validate our approach using a complementary genetic approach and demonstrate how hashing can generalize the benefits of single cell multiplexing to diverse samples and experimental designs.
2019-02-01 | Protein identification strategies in MALDI imaging mass spectrometry: a brief review | Ryan DJ, Spraggins JM, Caprioli RM. | TMC-Vanderbilt (Kidney) | Matrix assisted laser desorption/ionization (MALDI) imaging mass spectrometry (IMS) is a powerful technology used to investigate the spatial distributions of thousands of molecules throughout a tissue section from a single experiment. As proteins represent an important group of functional molecules in tissue and cells, the imaging of proteins has been an important point of focus in the development of IMS technologies and methods. Protein identification is crucial for the biological contextualization of molecular imaging data. However, gas-phase fragmentation efficiency of MALDI generated proteins presents significant challenges, making protein identification directly from tissue difficult. This review highlights methods and technologies specifically related to protein identification that have been developed to overcome these challenges in MALDI IMS experiments.
2019-02-05 | Data visualization literacy: Definitions, conceptual frameworks, exercises, and assessments | Börner K, Bueckle A and Ginda M. | HIVE MC-IU | In the information age, the ability to read and construct data visualizations becomes as important as the ability to read and write text. However, while standard definitions and theoretical frameworks to teach and assess textual, mathematical, and visual literacy exist, current data visualization literacy (DVL) definitions and frameworks are not comprehensive enough to guide the design of DVL teaching and assessment. This paper introduces a data visualization literacy framework (DVL-FW) that was specifically developed to define, teach, and assess DVL. The holistic DVL-FW promotes both the reading and construction of data visualizations, a pairing analogous to that of both reading and writing in textual literacy and understanding and applying in mathematical literacy. Specifically, the DVL-FW defines a hierarchical typology of core concepts and details the process steps that are required to extract insights from data. Advancing the state of the art, the DVL-FW interlinks theoretical and procedural knowledge and showcases how both can be combined to design curricula and assessment measures for DVL. Earlier versions of the DVL-FW have been used to teach DVL to more than 8,500 residential and online students, and results from this effort have helped revise and validate the DVL-FW presented here.
2019-02-15 | Dhaka: Variational Autoencoder for Unmasking Tumor Heterogeneity from Single Cell Genomic Data | Rashid S, Shah S, Bar-Joseph Z, Pandya R. | HIVE TC-CMU | MOTIVATION: Intra-tumor heterogeneity is one of the key confounding factors in deciphering tumor evolution. Malignant cells exhibit variations in their gene expression, copy numbers, and mutation even when originating from a single progenitor cell. Single cell sequencing of tumor cells has recently emerged as a viable option for unmasking the underlying tumor heterogeneity. However, extracting features from single cell genomic data in order to infer their evolutionary trajectory remains computationally challenging due to the extremely noisy and sparse nature of the data. RESULTS: Here we describe 'Dhaka', a variational autoencoder method which transforms single cell genomic data to a reduced dimension feature space that is more efficient in differentiating between (hidden) tumor subpopulations. Our method is general and can be applied to several different types of genomic data including copy number variation from scDNA-Seq and gene expression from scRNA-Seq experiments. We tested the method on synthetic and 6 single cell cancer datasets where the number of cells ranges from 250 to 6000 for each sample. Analysis of the resulting feature space revealed subpopulations of cells and their marker genes. The features are also able to infer the lineage and/or differentiation trajectory between cells greatly improving upon prior methods suggested for feature extraction and dimensionality reduction of such data. AVAILABILITY AND IMPLEMENTATION: All the datasets used in the paper are publicly available and developed software package and supporting info is available on Github https://github.com/MicrosoftGenomics/Dhaka.
2019-02-20 | The single-cell transcriptional landscape of mammalian organogenesis | Cao J, Spielmann M, Qiu X, Huang X, Ibrahim DM, Hill AJ, Zhang F, Mundlos S, Christiansen L, Steemers FJ, Trapnell C & Shendure J | TMC-Cal Tech | Mammalian organogenesis is a remarkable process. Within a short timeframe, the cells of the three germ layers transform into an embryo that includes most of the major internal and external organs. Here we investigate the transcriptional dynamics of mouse organogenesis at single-cell resolution. Using single-cell combinatorial indexing, we profiled the transcriptomes of around 2 million cells derived from 61 embryos staged between 9.5 and 13.5 days of gestation, in a single experiment. The resulting ‘mouse organogenesis cell atlas’ (MOCA) provides a global view of developmental processes during this critical window. We use Monocle 3 to identify hundreds of cell types and 56 trajectories, many of which are detected only because of the depth of cellular coverage, and collectively define thousands of corresponding marker genes. We explore the dynamics of gene expression within cell types and trajectories over time, including focused analyses of the apical ectodermal ridge, limb mesenchyme and skeletal muscle.
2019-03-01 | Multiple TOF/TOF Events in a Single Laser Shot for Multiplexed Lipid Identifications in MALDI Imaging Mass Spectrometry | Prentice BM, McMillen JC, Caprioli RM | TMC-Vanderbilt (Kidney) | Tandem mass spectrometry (MS/MS) is often used to identify lipids in matrix-assisted laser desorption/ionization imaging mass spectrometry (MALDI IMS) workflows. The molecular specificity afforded by MS/MS is crucial on MALDI time-of-flight (TOF) platforms that generally lack high resolution accurate mass measurement capabilities. Unfortunately, imaging MS/MS workflows generally only monitor a single precursor ion over the imaged area, limiting the throughput of this methodology. Herein, we demonstrate that multiple TOF/TOF events performed in each laser shot can be used to improve the throughput of imaging MS/MS. This is shown to enable the simultaneous identification of multiple phosphatidylcholine lipids in rat brain tissue. Uniquely, the separation in time achieved for the precursor ions in the TOF-1 region of the instrument is maintained for the fragment ions as they are analyzed in TOF-2, allowing for the differentiation of fragment ions of the exact same m/z derived from different precursor ions (e.g., the m/z 163 fragment ion from precursor ion m/z 772.5 is easily distinguished from the m/z 163 fragment ion from precursor ion m/z 826.5). This multiplexed imaging MS/MS approach allows for the acquisition of complete fragment ion spectra for multiple precursor ions per laser shot.
2019-03-25 | Transcriptome-scale super-resolved imaging in tissues by RNA seqFISH | Eng CL, Lawson M, Zhu Q, Dries R, Koulena N, Takei Y, Yun J, Cronin C, Karp C, Yuan GC, Cai L. | TMC-Cal Tech | Imaging the transcriptome in situ with high accuracy has been a major challenge in single-cell biology, which is particularly hindered by the limits of optical resolution and the density of transcripts in single cells. Here we demonstrate an evolution of sequential fluorescence in situ hybridization (seqFISH+). We show that seqFISH+ can image mRNAs for 10,000 genes in single cells, with high accuracy and sub-diffraction-limit resolution, in the cortex, subventricular zone and olfactory bulb of mouse brain, using a standard confocal microscope. The transcriptome-level profiling of seqFISH+ allows unbiased identification of cell classes and their spatial organization in tissues. In addition, seqFISH+ reveals subcellular mRNA localization patterns in cells and ligand-receptor pairs across neighbouring cells. This technology demonstrates the ability to generate spatial cell atlases and to perform discovery-driven studies of biological processes in situ.
2019-03-29 | Slide-seq: A scalable technology for measuring genome-wide expression at high spatial resolution | Rodriques SG, Stickels RR, Goeva A, Martin CA, Murray E, Vanderburg CR, Welch J, Chen LM, Chen F, Macosko EZ | RTI-Broad | Spatial positions of cells in tissues strongly influence function, yet a high-throughput, genome-wide readout of gene expression with cellular resolution is lacking. We developed Slide-seq, a method for transferring RNA from tissue sections onto a surface covered in DNA-barcoded beads with known positions, allowing the locations of the RNA to be inferred by sequencing. Using Slide-seq, we localized cell types identified by single-cell RNA sequencing datasets within the cerebellum and hippocampus, characterized spatial gene expression patterns in the Purkinje layer of mouse cerebellum, and defined the temporal evolution of cell type-specific responses in a mouse model of traumatic brain injury. These studies highlight how Slide-seq provides a scalable method for obtaining spatially resolved gene expression data at resolutions comparable to the sizes of individual cells.
2019-04-06 | Imaging mass spectrometry enables molecular profiling of mouse and human pancreatic tissue | Prentice BM, Hart NJ, Phillips N, Haliyur R, Judd A, Armandala R, Spraggins JM, Lowe CL, Boyd KL, Stein RW, Wright CV, Norris JL, Powers AC, Brissova M, Caprioli RM. | TMC-Vanderbilt (Kidney) | The molecular response and function of pancreatic islet cells during metabolic stress is a complex process. The anatomical location and small size of pancreatic islets coupled with current methodological limitations have prevented the achievement of a complete, coherent picture of the role that lipids and proteins play in cellular processes under normal conditions and in diseased states. Herein, we describe the development of untargeted tissue imaging mass spectrometry (IMS) technologies for the study of in situ protein and, more specifically, lipid distributions in murine and human pancreases.
2019-04-19 | The Importance of Clinical Tissue Imaging | Spraggins JM, Schwamborn K, Heeren RMA, Eberlin LS. | TMC-Vanderbilt (Kidney) | Tissue imaging by mass spectrometry (MS) combines the sensitivity and molecular specificity of MS with the spatial fidelity of classical histology for analysis of metabolites, lipids and proteins in tissues (Fig. 1). MS-based imaging is label-free, untargeted, sensitive, and specific, thereby enabling application in both basic biomedical research and the clinical laboratory. While all tissue imaging experiments are conceptually similar in their ability to generate spatial molecular data, ionization, data collection, and purpose vary widely. Here, we highlight recent technical advances and efforts that are motivating translational applications of this emerging technology.
2019-05-01 | SABER amplifies FISH: enhanced multiplexed imaging of RNA and DNA in cells and tissues | Kishi JY, Lapan SW, Beliveau BJ, West ER, Zhu A, Sasaki HM, Saka SK, Wang Y, Cepko CL, Yin P. | TTD-Harvard | Fluorescence in situ hybridization (FISH) reveals the abundance and positioning of nucleic acid sequences in fixed samples. Despite recent advances in multiplexed amplification of FISH signals, it remains challenging to achieve high levels of simultaneous amplification and sequential detection with high sampling efficiency and simple workflows. Here we introduce signal amplification by exchange reaction (SABER), which endows oligonucleotide-based FISH probes with long, single-stranded DNA concatemers that aggregate a multitude of short complementary fluorescent imager strands. We show that SABER amplified RNA and DNA FISH signals (5- to 450-fold) in fixed cells and tissues. We also applied 17 orthogonal amplifiers against chromosomal targets simultaneously and detected mRNAs with high efficiency. We then used 10-plex SABER-FISH to identify in vivo introduced enhancers with cell-type-specific activity in the mouse retina. SABER represents a simple and versatile molecular toolkit for rapid and cost-effective multiplexed imaging of nucleic acid targets.
2019-05-06 | Visualizing learner engagement, performance, and trajectories to evaluate and optimize online course design | Ginda M, Richey MC, Cousino M, Börner K. | HIVE MC-IU | Learning analytics and visualizations make it possible to examine and communicate learners’ engagement, performance, and trajectories in online courses to evaluate and optimize course design for learners. This is particularly valuable for workforce training involving employees who need to acquire new knowledge in the most effective manner. This paper introduces a set of metrics and visualizations that aim to capture key dynamical aspects of learner engagement, performance, and course trajectories. The metrics are applied to identify prototypical behavior and learning pathways through and interactions with course content, activities, and assessments. The approach is exemplified and empirically validated using more than 30 million separate logged events that capture activities of 1,608 Boeing engineers taking the MITxPro Course, “Architecture of Complex Systems,” delivered in Fall 2016. Visualization results show course structure and patterns of learner interactions with course material, activities, and assessments. Tree visualizations are used to represent course hierarchical structures and explicit sequence of content modules. Learner trajectory networks represent pathways and interactions of individual learners through course modules, revealing patterns of learner engagement, content access strategies, and performance. Results provide evidence for instructors and course designers for evaluating the usage and effectiveness of course materials and intervention strategies.
2019-06-04 | Cell lineage inference from SNP and scRNA-Seq data | Ding J, Lin C, Bar-Joseph Z. | HIVE TC-CMU | Several recent studies focus on the inference of developmental and response trajectories from single cell RNA-Seq (scRNA-Seq) data. A number of computational methods, often referred to as pseudo-time ordering, have been developed for this task. Recently, CRISPR has also been used to reconstruct lineage trees by inserting random mutations. However, both approaches suffer from drawbacks that limit their use. Here, we develop a method to detect significant, cell type specific, sequence mutations from scRNA-Seq data. We show that only a few mutations are enough for reconstructing good branching models. Integrating these mutations with expression data further improves the accuracy of the reconstructed models. As we show, the majority of mutations we identify are likely RNA editing events indicating that such information can be used to distinguish cell types.
2019-06-06 | Comprehensive Integration of Single-Cell Data | Stuart T, Butler A, Hoffman P, Hafemeister C, Papalexi E, Mauck WM 3rd, Hao Y, Stoeckius M, Smibert P, Satija R. | HIVE MC-NYGC | Single-cell transcriptomics has transformed our ability to characterize cell states, but deep biological understanding requires more than a taxonomic listing of clusters. As new methods arise to measure distinct cellular modalities, a key analytical challenge is to integrate these datasets to better understand cellular identity and function. Here, we develop a strategy to “anchor” diverse datasets together, enabling us to integrate single-cell measurements not only across scRNA-seq technologies, but also across different modalities. After demonstrating improvement over existing methods for integrating scRNA-seq data, we anchor scRNA-seq experiments with scATAC-seq to explore chromatin differences in closely related interneuron subsets and project protein expression measurements onto a bone marrow atlas to characterize lymphocyte populations. Lastly, we harmonize in situ gene expression and scRNA-seq datasets, allowing transcriptome-wide imputation of spatial gene expression patterns. Our work presents a strategy for the assembly of harmonized references and transfer of information across datasets.
2019-06-07 | The human body at cellular resolution: the NIH Human Biomolecular Atlas Program | HuBMAP Consortium | Consortium | Transformative technologies are enabling the construction of three-dimensional maps of tissues with unprecedented spatial and molecular resolution. Over the next seven years, the NIH Common Fund Human Biomolecular Atlas Program (HuBMAP) intends to develop a widely accessible framework for comprehensively mapping the human body at single-cell resolution by supporting technology development, data acquisition, and detailed spatial mapping. HuBMAP will integrate its efforts with other funding agencies, programs, consortia, and the biomedical research community at large towards the shared vision of a comprehensive, accessible three-dimensional molecular and cellular atlas of the human body, in health and under various disease conditions.
2019-06-13 | Two Specific Sulfatide Species Are Dysregulated during Renal Development in a Mouse Model of Alport Syndrome | Gessel MM, Spraggins JM, Voziyan PA, Abrahamson DR, Caprioli RM, Hudson BG. | TMC-Vanderbilt (Kidney) | Alport syndrome is caused by mutations in collagen IV that alter the morphology of renal glomerular basement membrane. Mutations result in proteinuria, tubulointerstitial fibrosis, and renal failure but the pathogenic mechanisms are not fully understood. Using imaging mass spectrometry, we aimed to determine whether the spatial and/or temporal patterns of renal lipids are perturbed during the development of Alport syndrome in the mouse model. Our results show that most sulfatides are present at similar levels in both the wild-type (WT) and the Alport kidneys, with the exception of two specific sulfatide species, SulfoHex-Cer(d18:2/24:0) and SulfoHex-Cer(d18:2/16:0). In the Alport but not in WT kidneys, the levels of these species mirror the previously described abnormal laminin expression in Alport syndrome. The presence of these sulfatides in renal tubules but not in glomeruli suggests that this specific aberrant lipid pattern may be related to the development of tubulointerstitial fibrosis in Alport disease.
2019-06-18 | MicroLESA: Integrating Autofluorescence Microscopy, In Situ Micro-Digestions, and Liquid Extraction Surface Analysis for High Spatial Resolution Targeted Proteomic Studies. | Ryan DJ, Patterson NH, Putnam NE, Wilde AD, Weiss A, Perry WJ, Cassat JE, Skaar EP, Caprioli RM, Spraggins JM. | TMC-Vanderbilt (Kidney) | The ability to target discrete features within tissue using liquid surface extractions enables the identification of proteins while maintaining the spatial integrity of the sample. Here, we present a liquid extraction surface analysis (LESA) workflow, termed microLESA, that allows proteomic profiling from discrete tissue features of ∼110 μm in diameter by integrating nondestructive autofluorescence microscopy and spatially targeted liquid droplet micro-digestion. Autofluorescence microscopy provides the visualization of tissue foci without the need for chemical stains or the use of serial tissue sections. Tryptic peptides are generated from tissue foci by applying small volume droplets (∼250 pL) of enzyme onto the surface prior to LESA. The microLESA workflow reduced the diameter of the sampled area almost 5-fold compared to previous LESA approaches. Experimental parameters, such as tissue thickness, trypsin concentration, and enzyme incubation duration, were tested to maximize proteomics analysis. The microLESA workflow was applied to the study of fluorescently labeled Staphylococcus aureus infected murine kidney to identify unique proteins related to host defense and bacterial pathogenesis. Proteins related to nutritional immunity and host immune response were identified by performing microLESA at the infectious foci and surrounding abscess. These identifications were then used to annotate specific proteins observed in infected kidney tissue by MALDI FT-ICR IMS through accurate mass matching.
2019-06-19 | The 2019 mathematical oncology roadmap | Rockne RC, Hawkins-Daarud A, Swanson KR, Sluka JP, Glazier JA, Macklin P, Hormuth DA, Jarrett AM, Lima EABF, Tinsley Oden J, Biros G, Yankeelov TE, Curtius K, Al Bakir I, Wodarz D, Komarova N, Aparicio L, Bordyuh M, Rabadan R, Finley SD, Enderling H, Caudell J, et al. | HIVE MC-IU | Whether the nom de guerre is Mathematical Oncology, Computational or Systems Biology, Theoretical Biology, Evolutionary Oncology, Bioinformatics, or simply Basic Science, there is no denying that mathematics continues to play an increasingly prominent role in cancer research. Mathematical Oncology, defined here simply as the use of mathematics in cancer research, complements and overlaps with a number of other fields that rely on mathematics as a core methodology. As a result, Mathematical Oncology has a broad scope, ranging from theoretical studies to clinical trials designed with mathematical models. This Roadmap differentiates Mathematical Oncology from related fields and demonstrates specific areas of focus within this unique field of research. The dominant theme of this Roadmap is the personalization of medicine through mathematics, modelling, and simulation. This is achieved through the use of patient-specific clinical data to: develop individualized screening strategies to detect cancer earlier; make predictions of response to therapy; design adaptive, patient-specific treatment plans to overcome therapy resistance; and establish domain-specific standards to share model predictions and to make models and simulations reproducible. The cover art for this Roadmap was chosen as an apt metaphor for the beautiful, strange, and evolving relationship between mathematics and cancer.
2019-06-27 | A single-nucleus RNA-sequencing pipeline to decipher the molecular anatomy and pathophysiology of human kidneys | Lake BB, Chen S, Hoshi M, Plongthongkum N, Salamon D, Knoten A, Vijayan A, Venkatesh R, Kim EH, Gao D, Gaut J, Zhang K, Jain S | TMC-UCSD | Defining cellular and molecular identities within the kidney is necessary to understand its organization and function in health and disease. Here we demonstrate a reproducible method with minimal artifacts for single-nucleus Droplet-based RNA sequencing (snDrop-Seq) that we use to resolve thirty distinct cell populations in human adult kidney. We define molecular transition states along more than ten nephron segments spanning two major kidney regions. We further delineate cell type-specific expression of genes associated with chronic kidney disease, diabetes and hypertension, providing insight into possible targeted therapies. This includes expression of a hypertension-associated mechano-sensory ion channel in mesangial cells, and identification of proximal tubule cell populations defined by pathogenic expression signatures. Our fully optimized, quality-controlled transcriptomic profiling pipeline constitutes a tool for the generation of healthy and diseased molecular atlases applicable to clinical samples.
2019-08-01 | A recommended and verified procedure for in situ tryptic digestion of formalin-fixed paraffin-embedded tissues for analysis by matrix-assisted laser desorption/ionization imaging mass spectrometry | Judd AM, Gutierrez DB, Moore JL, Patterson NH, Yang J, Romer CE, Norris JL, Caprioli RM | TMC-Vanderbilt (Kidney) | Matrix-assisted laser desorption/ionization imaging mass spectrometry (MALDI IMS) is a molecular imaging technology uniquely capable of untargeted measurement of proteins, lipids, and metabolites while retaining spatial information about their location in situ. This powerful combination of capabilities has the potential to bring a wealth of knowledge to the field of molecular histology. Translation of this innovative research tool into clinical laboratories requires the development of reliable sample preparation protocols for the analysis of proteins from formalin-fixed paraffin-embedded (FFPE) tissues, the standard preservation process in clinical pathology. Although ideal for stained tissue analysis by microscopy, the FFPE process cross-links, disrupts, or can remove proteins from the tissue, making analysis of the protein content challenging. To date, reported approaches differ widely in process and efficacy. This tutorial presents a strategy derived from systematic testing and optimization of key parameters, for reproducible in situ tryptic digestion of proteins in FFPE tissue and subsequent MALDI IMS analysis. The approach describes a generalized method for FFPE tissues originating from virtually any source.
2019-08-19 | Immuno-SABER enables highly multiplexed and amplified protein imaging in tissues | Saka SK, Wang Y, Kishi JY, Zhu A, Zeng Y, Xie W, Kirli K, Yapp C, Cicconet M, Beliveau BJ, Lapan SW, Yin S, Lin M, Boyden ES, Kaeser PS, Pihan G, Church GM, Yin P. | TTD-Harvard | Spatial mapping of proteins in tissues is hindered by limitations in multiplexing, sensitivity and throughput. Here we report immunostaining with signal amplification by exchange reaction (Immuno-SABER), which achieves highly multiplexed signal amplification via DNA-barcoded antibodies and orthogonal DNA concatemers generated by primer exchange reaction (PER). SABER offers independently programmable signal amplification without in situ enzymatic reactions, and intrinsic scalability to rapidly amplify and visualize a large number of targets when combined with fast exchange cycles of fluorescent imager strands. We demonstrate 5- to 180-fold signal amplification in diverse samples (cultured cells, cryosections, formalin-fixed paraffin-embedded sections and whole-mount tissues), as well as simultaneous signal amplification for ten different proteins using standard equipment and workflows. We also combined SABER with expansion microscopy to enable rapid, multiplexed super-resolution tissue imaging. Immuno-SABER presents an effective and accessible platform for multiplexed and amplified imaging of proteins with high sensitivity and throughput.
2019-09-02A pooled single-cell genetic screen identifies regulatory checkpoints in the continuum of the epithelial-to-mesenchymal transitionMcFaline-Figueroa JL, Hill AJ, Qiu X, Jackson D, Shendure J, Trapnell C.TMC-Cal TechIntegrating single-cell trajectory analysis with pooled genetic screening could reveal the genetic architecture that guides cellular decisions in development and disease. We applied this paradigm to probe the genetic circuitry that controls epithelial-to-mesenchymal transition (EMT). We used single-cell RNA sequencing to profile epithelial cells undergoing a spontaneous spatially determined EMT in the presence or absence of transforming growth factor-β. Pseudospatial trajectory analysis identified continuous waves of gene regulation as opposed to discrete ‘partial’ stages of EMT. KRAS was connected to the exit from the epithelial state and the acquisition of a fully mesenchymal phenotype. A pooled single-cell CRISPR-Cas9 screen identified EMT-associated receptors and transcription factors, including regulators of KRAS, whose loss impeded progress along the EMT. Inhibiting the KRAS effector MEK and its upstream activators EGFR and MET demonstrates that interruption of key signaling events reveals regulatory ‘checkpoints’ in the EMT continuum that mimic discrete stages, and reconciles opposing views of the program that controls EMT.
2019-09-10Supervised classification enables rapid annotation of cell atlasesPliner HA, Shendure J, Trapnell C.TMC-Cal TechSingle-cell molecular profiling technologies are gaining rapid traction, but the manual process by which resulting cell types are typically annotated is labor intensive and rate-limiting. We describe Garnett, a tool for rapidly annotating cell types in single-cell transcriptional profiling and single-cell chromatin accessibility datasets, based on an interpretable, hierarchical markup language of cell type-specific genes. Garnett successfully classifies cell types in tissue and whole organism datasets, as well as across species.
2019-10-07GiniClust3: a fast and memory-efficient tool for rare cell type identificationDong R, Yuan GC.TTD-Cal TechBACKGROUND: With the rapid development of single-cell RNA sequencing technology, it is possible to dissect cell-type composition at high resolution. A number of methods have been developed to identify rare cell types. However, existing methods are still not scalable to large datasets, limiting their utility. To overcome this limitation, we present a new software package, called GiniClust3, which is an extension of GiniClust2 and significantly faster and more memory-efficient than previous versions. RESULTS: Using GiniClust3, it only takes about 7 h to identify both common and rare cell clusters from a dataset that contains more than one million cells. Cell type mapping and perturbation analyses show that GiniClust3 could robustly identify cell clusters. CONCLUSIONS: Taken together, these results suggest that GiniClust3 is a powerful tool to identify both common and rare cell populations and can handle large datasets. GiniClust3 is implemented as an open-source Python package and is available at https://github.com/rdong08/GiniClust3.
2019-10-08High-Performance Molecular Imaging with MALDI Trapped Ion-Mobility Time-of-Flight (timsTOF) Mass SpectrometrySpraggins JM, Djambazova KV, Rivera ES, Migas LG, Neumann EK, Fuetterer A, Suetering J, Goedecke N, Ly A, Van de Plas R, Caprioli RM.TMC-Vanderbilt (Kidney)
2019-10-14High-throughput sequencing of the transcriptome and chromatin accessibility in the same cellChen S, Lake BB, Zhang K.TMC-UCSDSingle-cell RNA sequencing can reveal the transcriptional state of cells, yet provides little insight into the upstream regulatory landscape associated with open or accessible chromatin regions. Joint profiling of accessible chromatin and RNA within the same cells would permit direct matching of transcriptional regulation to its outputs. Here, we describe droplet-based single-nucleus chromatin accessibility and mRNA expression sequencing (SNARE-seq), a method that can link a cell’s transcriptome with its accessible chromatin for sequencing at scale. Specifically, accessible sites are captured by Tn5 transposase in permeabilized nuclei to permit, within many droplets in parallel, DNA barcode tagging together with the mRNA molecules from the same cells. To demonstrate the utility of SNARE-seq, we generated joint profiles of 5,081 and 10,309 cells from neonatal and adult mouse cerebral cortices, respectively. We reconstructed the transcriptome and epigenetic landscapes of major and rare cell types, uncovered lineage-specific accessible sites, especially for low-abundance cells, and connected the dynamics of promoter accessibility with transcription level during neurogenesis.
2019-10-29Staphylococcus aureus exhibits heterogeneous siderophore production within the vertebrate hostPerry WJ, Spraggins JM, Sheldon JR, Grunenwald CM, Heinrichs DE, Cassat JE, Skaar EP, Caprioli RMTMC-Vanderbilt (Kidney)Siderophores, iron-scavenging small molecules, are fundamental to bacterial nutrient metal acquisition and enable pathogens to overcome challenges imposed by nutritional immunity. Multimodal imaging mass spectrometry allows visualization of host-pathogen iron competition, by mapping siderophores within infected tissue. We have observed heterogeneous distributions of Staphylococcus aureus siderophores across infectious foci, challenging the paradigm that the vertebrate host is a uniformly iron-depleted environment to invading microbes.
2019-11-13High spatial resolution imaging of biological tissues using nanospray desorption electrospray ionization mass spectrometryYin R, Burnum-Johnson KE, Sun X, Dey SK & Laskin JTTD-Purdue
2019-11-15Continuous State HMMs for Modeling Time Series Single Cell RNA-Seq DataLin C, Bar-Joseph Z.HIVE TC-CMUMOTIVATION: Methods for reconstructing developmental trajectories from time series single cell RNA-Seq (scRNA-Seq) data can be largely divided into two categories. The first, often referred to as pseudotime ordering methods, are deterministic and rely on dimensionality reduction followed by an ordering step. The second learns a probabilistic branching model to represent the developmental process. While both types have been successful, each suffers from shortcomings that can impact their accuracy. RESULTS: We developed a new method based on continuous state HMMs (CSHMMs) for representing and modeling time series scRNA-Seq data. We define the CSHMM model and provide efficient learning and inference algorithms which allow the method to determine both the structure of the branching process and the assignment of cells to these branches. Analyzing several developmental single cell datasets we show that the CSHMM method accurately infers branching topology and correctly and continuously assigns cells to paths, improving upon prior methods proposed for this task. Analysis of genes based on the continuous cell assignment identifies known and novel markers for different cell types. AVAILABILITY: Software and Supporting website: www.andrew.cmu.edu/user/chiehl1/CSHMM/. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
2019-12-12Toward a Common Coordinate Framework for the Human BodyRood JE, Stuart T, Ghazanfar S, Biancalani T, Fisher E, Butler A, Hupalowska A, Gaffney L, Mauck W, Eraslan G, Marioni JC, Regev A, Satija R.HIVE MC-NYGCUnderstanding the genetic and molecular drivers of phenotypic heterogeneity across individuals is central to biology. As new technologies enable fine-grained and spatially resolved molecular profiling, we need new computational approaches to integrate data from the same organ across different individuals into a consistent reference and to construct maps of molecular and cellular organization at histological and anatomical scales. Here, we review previous efforts and discuss challenges involved in establishing such a common coordinate framework, the underlying map of tissues and organs. We focus on strategies to handle anatomical variation across individuals and highlight the need for new technologies and analytical methods spanning multiple hierarchical scales of spatial resolution.
2019-12-20Uncovering matrix effects on lipid analyses in MALDI imaging mass spectrometry experimentsPerry WJ, Patterson NH, Prentice BM, Neumann EK, Caprioli RM, Spraggins JM.TMC-Vanderbilt (Kidney)The specific matrix used in matrix‐assisted laser desorption/ionization imaging mass spectrometry (MALDI IMS) can have an effect on the molecules ionized from a tissue sample. The sensitivity for distinct classes of biomolecules can vary when employing different MALDI matrices. Here, we compare the intensities of various lipid subclasses measured by Fourier transform ion cyclotron resonance (FT‐ICR) IMS of murine liver tissue when using 9‐aminoacridine (9AA), 5‐chloro‐2‐mercaptobenzothiazole (CMBT), 1,5‐diaminonaphthalene (DAN), 2,5‐Dihydroxyacetophenone (DHA), and 2,5‐dihydroxybenzoic acid (DHB). Principal component analysis and receiver operating characteristic curve analysis revealed significant matrix effects on the relative signal intensities observed for different lipid subclasses and adducts. Comparison of spectral profiles and quantitative assessment of the number and intensity of species from each lipid subclass showed that each matrix produces unique lipid signals. In positive ion mode, matrix application methods played a role in the MALDI analysis for different cationic species. Comparisons of different methods for the application of DHA showed a significant increase in the intensity of sodiated and potassiated analytes when using an aerosol sprayer. In negative ion mode, lipid profiles generated using DAN were significantly different than all other matrices tested. This difference was found to be driven by modification of phosphatidylcholines during ionization that enables them to be detected in negative ion mode. These modified phosphatidylcholines are isomeric with common phosphatidylethanolamines confounding MALDI IMS analysis when using DAN. These results show an experimental basis of MALDI analyses when analyzing lipids from tissue and allow for more informed selection of MALDI matrices when performing lipid IMS experiments.
2019-12-23Normalization and variance stabilization of single-cell RNA-seq data using regularized negative binomial regressionHafemeister C, Satija R.HIVE MC-NYGCSingle-cell RNA-seq (scRNA-seq) data exhibits significant cell-to-cell variation due to technical factors, including the number of molecules detected in each cell, which can confound biological heterogeneity with technical effects. To address this, we present a modeling framework for the normalization and variance stabilization of molecular count data from scRNA-seq experiments. We propose that the Pearson residuals from “regularized negative binomial regression,” where cellular sequencing depth is utilized as a covariate in a generalized linear model, successfully remove the influence of technical characteristics from downstream analyses while preserving biological heterogeneity. Importantly, we show that an unconstrained negative binomial model may overfit scRNA-seq data, and overcome this by pooling information across genes with similar abundances to obtain stable parameter estimates. Our procedure omits the need for heuristic steps including pseudocount addition or log-transformation and improves common downstream analytical tasks such as variable gene selection, dimensional reduction, and differential expression. Our approach can be applied to any UMI-based scRNA-seq dataset and is freely available as part of the R package sctransform, with a direct interface to our single-cell toolkit Seurat.
2019-12-26Deep learning for inferring gene relationships from single-cell expression dataYuan Y, Bar-Joseph Z.HIVE TC-CMUSeveral methods were developed to mine gene–gene relationships from expression data. Examples include correlation and mutual information methods for coexpression analysis, clustering and undirected graphical models for functional assignments, and directed graphical models for pathway reconstruction. Using an encoding for gene expression data, followed by deep neural networks analysis, we present a framework that can successfully address all of these diverse tasks. We show that our method, convolutional neural network for coexpression (CNNC), improves upon prior methods in tasks ranging from predicting transcription factor targets to identifying disease-related genes to causality inference. CNNC’s encoding provides insights about some of the decisions it makes and their biological basis. CNNC is flexible and can easily be extended to integrate additional types of genomics data, leading to further improvements in its performance.
2020-01-07Automated mass spectrometry imaging of over 2000 proteins from tissue sections at 100-μm spatial resolutionPiehowski PD, Zhu Y, Bramer LM, Stratton KG, Zhao R, Orton DJ, Moore RJ, Yuan J, Mitchell HD, Gao Y, Webb-Robertson BM, Dey SK, Kelly RT, Burnum-Johnson KE.TTD-Purdue
2020-02-18Inferring TF activation order in time series scRNA-Seq studiesLin C, Ding J, Bar-Joseph Z.HIVE TC-CMUMethods for the analysis of time series single cell expression data (scRNA-Seq) either do not utilize information about transcription factors (TFs) and their targets or only study these as a post-processing step. Using such information can both improve the accuracy of the reconstructed model and cell assignments and provide information on how and when the process is regulated. We developed the Continuous-State Hidden Markov Models TF (CSHMM-TF) method which integrates probabilistic modeling of scRNA-Seq data with the ability to assign TFs to specific activation points in the model. TFs are assumed to influence the emission probabilities for cells assigned to later time points, allowing us to identify not just the TFs controlling each path but also their order of activation. We tested CSHMM-TF on several mouse and human datasets. As we show, the method was able to identify known and novel TFs for all processes, the assigned time of activation agrees with both expression information and prior knowledge, and combinatorial predictions are supported by known interactions. We also show that CSHMM-TF improves upon prior methods that do not utilize TF-gene interaction
2020-02-28Immune monitoring using mass cytometry and related high-dimensional imaging approachesHartmann FJ, Bendall SC.RTI-StanfordThe cellular complexity and functional diversity of the human immune system necessitate the use of high-dimensional single-cell tools to uncover its role in multifaceted diseases such as rheumatic diseases, as well as other autoimmune and inflammatory disorders. Proteomic technologies that use elemental (heavy metal) reporter ions, such as mass cytometry (also known as CyTOF) and analogous high-dimensional imaging approaches (including multiplexed ion beam imaging (MIBI) and imaging mass cytometry (IMC)), have been developed from their low-dimensional counterparts, flow cytometry and immunohistochemistry, to meet this need. A growing number of studies have been published that use these technologies to identify functional biomarkers and therapeutic targets in rheumatic diseases, but the full potential of their application to rheumatic disease research has yet to be fulfilled. This Review introduces the underlying technologies for high-dimensional immune monitoring and discusses aspects necessary for their successful implementation, including study design principles, analytical tools and future developments for the field of rheumatology.
2020-03-11Multiplexed single-cell morphometry for hematopathology diagnosticsTsai AG, Glass DR, Juntilla M, Hartmann FJ, Oak JS, Fernandez-Pol S, Ohgami RS, Bendall SC.RTI-StanfordThe diagnosis of lymphomas and leukemias requires hematopathologists to integrate microscopically visible cellular morphology with antibody-identified cell surface molecule expression. To merge these into one high-throughput, highly multiplexed, single-cell assay, we quantify cell morphological features by their underlying, antibody-measurable molecular components, which empowers mass cytometers to ‘see’ like pathologists. When applied to 71 diverse clinical samples, single-cell morphometric profiling reveals robust and distinct patterns of ‘morphometric’ markers for each major cell type. Individually, lamin B1 highlights acute leukemias, lamin A/C helps distinguish normal from neoplastic mature T cells, and VAMP-7 recapitulates light-cytometric side scatter. Combined with machine learning, morphometric markers form intuitive visualizations of normal and neoplastic cellular distribution and differentiation. When recalibrated for myelomonocytic blast enumeration, this approach is superior to flow cytometry and comparable to expert microscopy, bypassing years of specialized training. The contextualization of traditional surface markers on independent morphometric frameworks permits more sensitive and automated diagnosis of complex hematopoietic diseases.
2020-03-13Considerations for Using the Vasculature as a Coordinate System to Map All the Cells in the Human BodyWeber GM, Ju Y, Börner K.HIVE MC-IUSeveral ongoing international efforts are developing methods of localizing single cells within organs or mapping the entire human body at the single cell level, including the Chan Zuckerberg Initiative’s Human Cell Atlas (HCA), the Knut and Alice Wallenberg Foundation’s Human Protein Atlas (HPA), and the National Institutes of Health’s Human BioMolecular Atlas Program (HuBMAP). Their goals are to understand cell specialization, interactions, spatial organization in their natural context, and ultimately the function of every cell within the body. In the same way that the Human Genome Project had to assemble sequence data from different people to construct a complete sequence, multiple centers around the world are collecting tissue specimens from diverse populations that vary in age, race, sex, and body size. A challenge will be combining these heterogeneous tissue samples into a 3D reference map that will enable multiscale, multidimensional Google Maps-like exploration of the human body. Key to making alignment of tissue samples work is identifying and using a coordinate system called a Common Coordinate Framework (CCF), which defines the positions, or “addresses,” in a reference body, from whole organs down to functional tissue units and individual cells. In this perspective, we examine the concept of a CCF based on the vasculature and describe why it would be an attractive choice for mapping the human body.
2020-03-27Tools for the analysis of high-dimensional single-cell RNA sequencing dataWu Y, Zhang K.TMC-UCSDBreakthroughs in the development of high-throughput technologies for profiling transcriptomes at the single-cell level have helped biologists to understand the heterogeneity of cell populations, disease states and developmental lineages. However, these single-cell RNA sequencing (scRNA-seq) technologies generate an extraordinary amount of data, which creates analysis and interpretation challenges. Additionally, scRNA-seq datasets often contain technical sources of noise owing to incomplete RNA capture, PCR amplification biases and/or batch effects specific to the patient or sample. If not addressed, this technical noise can bias the analysis and interpretation of the data. In response to these challenges, a suite of computational tools has been developed to process, analyse and visualize scRNA-seq datasets. Although the specific steps of any given scRNA-seq analysis might differ depending on the biological questions being asked, a core workflow is used in most analyses. Typically, raw sequencing reads are processed into a gene expression matrix that is then normalized and scaled to remove technical noise. Next, cells are grouped according to similarities in their patterns of gene expression, which can be summarized in two or three dimensions for visualization on a scatterplot. These data can then be further analysed to provide an in-depth view of the cell types or developmental trajectories in the sample of interest.
2020-04-01Integrated molecular imaging technologies for investigation of metals in biological systems: A brief reviewPerry WJ, Weiss A, Van de Plas R, Spraggins JM, Caprioli RM, Skaar EP.TMC-Vanderbilt (Kidney)Metals play an essential role in biological systems and are required as structural or catalytic co-factors in many proteins. Disruption of the homeostatic control and/or spatial distributions of metals can lead to disease. Imaging technologies have been developed to visualize elemental distributions across a biological sample. Measurements of elemental distributions by imaging mass spectrometry and imaging X-ray fluorescence are increasingly employed with technologies that can assess histological features and molecular compositions. Data from several modalities can be interrogated as multimodal images to correlate morphological, elemental, and molecular properties. Elemental and molecular distributions have also been axially resolved to achieve three-dimensional volumes, dramatically increasing the biological information. In this review, we provide an overview of recent developments in the field of metal imaging with an emphasis on multimodal studies in two and three dimensions. We specifically highlight studies that present technological advancements and biological applications of how metal homeostasis affects human health.
2020-04-02Reconstructed Single-Cell Fate Trajectories Define Lineage Plasticity Windows during Differentiation of Human PSC-Derived Distal Lung ProgenitorsHurley K, Ding J, Villacorta-Martin C, Herriges MJ, Jacob A, Vedaie M, Alysandratos KD, Sun YL, Lin C, Werder RB, Huang J, Wilson AA, Mithal A, Mostoslavsky G, Oglesby I, Caballero IS, Guttentag SH, Ahangari F, Kaminski N, Rodriguez-Fraticelli A, Camargo F, Bar-Joseph Z, Kotton DN.HIVE TC-CMUAlveolar epithelial type 2 cells (AEC2s) are the facultative progenitors responsible for maintaining lung alveoli throughout life but are difficult to isolate from patients. Here, we engineer AEC2s from human pluripotent stem cells (PSCs) in vitro and use time-series single-cell RNA sequencing with lentiviral barcoding to profile the kinetics of their differentiation in comparison to primary fetal and adult AEC2 benchmarks. We observe bifurcating cell-fate trajectories as primordial lung progenitors differentiate in vitro, with some progeny reaching their AEC2 fate target, while others diverge to alternative non-lung endodermal fates. We develop a Continuous State Hidden Markov model to identify the timing and type of signals, such as overexuberant Wnt responses, that induce some early multipotent NKX2-1+ progenitors to lose lung fate. Finally, we find that this initial developmental plasticity is regulatable and subsides over time, ultimately resulting in PSC-derived AEC2s that exhibit a stable phenotype and nearly limitless self-renewal capacity.
2020-05-01Unsupervised machine learning for exploratory data analysis in imaging mass spectrometryVerbeeck N, Caprioli RM, Van de Plas R.TMC-Vanderbilt (Kidney)Imaging mass spectrometry (IMS) is a rapidly advancing molecular imaging modality that can map the spatial distribution of molecules with high chemical specificity. IMS does not require prior tagging of molecular targets and is able to measure a large number of ions concurrently in a single experiment. While this makes it particularly suited for exploratory analysis, the large amount and high‐dimensional nature of data generated by IMS techniques make automated computational analysis indispensable. Research into computational methods for IMS data has touched upon different aspects, including spectral preprocessing, data formats, dimensionality reduction, spatial registration, sample classification, differential analysis between IMS experiments, and data‐driven fusion methods to extract patterns corroborated by both IMS and other imaging modalities. In this work, we review unsupervised machine learning methods for exploratory analysis of IMS data, with particular focus on (a) factorization, (b) clustering, and (c) manifold learning. To provide a view across the various IMS modalities, we have attempted to include examples from a range of approaches including matrix assisted laser desorption/ionization, desorption electrospray ionization, and secondary ion mass spectrometry‐based IMS. This review aims to be an entry point for both (i) analytical chemists and mass spectrometry experts who want to explore computational techniques; and (ii) computer scientists and data mining specialists who want to enter the IMS field.
2020-05-14Use of Single Cell -omic Technologies to Study the Gastrointestinal Tract and Diseases, From Single Cell Identities to Patient FeaturesIslam M, Chen B, Spraggins JM, Kelly RT, Lau KS.TMC-Vanderbilt (Kidney)Single cells are the building blocks of tissue systems that determine organ phenotypes, behaviors, and function. Understanding the differences between cell types and their activities might provide us with insights into normal tissue functions, development of disease, and new therapeutic strategies. Although -omic level single cell technologies are a relatively recent development that have been used only in laboratory studies, these approaches might eventually be used in the clinic. We review the prospects of applying single cell genome, transcriptome, epigenome, proteome, and metabolome analyses to gastroenterology and hepatology research. Combining data from multi-omic platforms and rapid technological developments could lead to new diagnostic, prognostic, and therapeutic approaches.
2020-05-19Discovering New Lipidomic Features Using Cell Type Specific Fluorophore Expression to Provide Spatial and Biological Specificity in a Multimodal Workflow with MALDI Imaging Mass SpectrometryJones MA, Cho SH, Patterson NH, Van de Plas R, Spraggins JM, Boothby MR, Caprioli RM.TMC-Vanderbilt (Kidney)Identifying the spatial distributions of biomolecules in tissue is crucial for understanding integrated function. Imaging mass spectrometry (IMS) allows simultaneous mapping of thousands of biosynthetic products such as lipids but has needed a means of identifying specific cell-types or functional states to correlate with molecular localization. We report, here, advances starting from identity marking with a genetically encoded fluorophore. The fluorescence emission data were integrated with IMS data through multimodal image processing with advanced registration techniques and data-driven image fusion. In an unbiased analysis of spleens, this integrated technology enabled identification of ether lipid species preferentially enriched in germinal centers. We propose that this use of genetic marking for microanatomical regions of interest can be paired with molecular information from IMS for any tissue, cell-type, or activity state for which fluorescence is driven by a gene-tracking allele and ultimately with outputs of other means of spatial mapping.
2020-06-16Single-cell Lineage Tracing by Integrating CRISPR-Cas9 Mutations With Transcriptomic DataZafar H, Lin C, Bar-Joseph Z.HIVE TC-CMURecent studies combine two novel technologies, single-cell RNA-sequencing and CRISPR-Cas9 barcode editing, for elucidating developmental lineages at the whole organism level. While these studies provided several insights, they face several computational challenges. First, lineages are reconstructed based on noisy and often saturated random mutation data. Additionally, due to the randomness of the mutations, lineages from multiple experiments cannot be combined to reconstruct a species-invariant lineage tree. To address these issues we developed a statistical method, LinTIMaT, which reconstructs cell lineages using a maximum-likelihood framework by integrating mutation and expression data. Our analysis shows that expression data helps resolve the ambiguities arising when lineages are inferred based on mutations alone, while also enabling the integration of different individual lineages for the reconstruction of an invariant lineage tree. LinTIMaT lineages have better cell type coherence, improve the functional significance of gene sets and provide new insights on progenitors and differentiation pathways.
2020-06-30A Cancer Biologist's Primer on Machine Learning Applications in High-Dimensional CytometryKeyes TJ, Domizi P, Lo YC, Nolan GP, Davis KLTMC-StanfordThe application of machine learning and artificial intelligence to high-dimensional cytometry data sets has increasingly become a staple of bioinformatic data analysis over the past decade. This is especially true in the field of cancer biology, where protocols for collecting multiparameter single-cell data in a high-throughput fashion are rapidly developed. As the use of machine learning methodology in cytometry becomes increasingly common, there is a need for cancer biologists to understand the basic theory and applications of a variety of algorithmic tools for analyzing and interpreting cytometry data. We introduce the reader to several keystone machine learning-based analytic approaches with an emphasis on defining key terms and introducing a conceptual framework for making translational or clinically relevant discoveries. The target audience consists of cancer cell biologists and physician-scientists interested in applying these tools to their own data, but who may have limited training in bioinformatics.
2020-07-14An Integrated Multi-omic Single-Cell Atlas of Human B Cell Identity.Glass DR, Tsai AG, Oliveria JP, Hartmann FJ, Kimmey SC, Calderon AA, Borges L, Glass MC, Wagar LE, Davis MM, Bendall SC.RTI-StanfordB cells are capable of a wide range of effector functions including antibody secretion, antigen presentation, cytokine production, and generation of immunological memory. A consistent strategy for classifying human B cells by using surface molecules is essential to harness this functional diversity for clinical translation. We developed a highly multiplexed screen to quantify the co-expression of 351 surface molecules on millions of human B cells. We identified differentially expressed molecules and aligned their variance with isotype usage, VDJ sequence, metabolic profile, biosynthesis activity, and signaling response. Based on these analyses, we propose a classification scheme to segregate B cells from four lymphoid tissues into twelve unique subsets, including a CD45RB+CD27- early memory population, a class-switched CD39+ tonsil-resident population, and a CD19hiCD11c+ memory population that potently responds to immune activation. This classification framework and underlying datasets provide a resource for further investigations of human B cell identity and function.
2020-07-16Localization of the lens intermediate filament switch by imaging mass spectrometryWang Z, Ryan DJ, Schey KLTMC-Vanderbilt (Eye/pancreas)Imaging mass spectrometry (IMS) enables targeted and untargeted visualization of the spatial localization of molecules in tissues with great specificity. The lens is a unique tissue that contains fiber cells corresponding to various stages of differentiation that are packed in a highly ordered spatial arrangement. The application of IMS to lens tissue localizes molecular features that are spatially related to the fiber cell organization. Such spatially resolved molecular information assists our understanding of lens structure and physiology; however, protein IMS studies are typically limited to abundant, soluble, low molecular weight proteins. In this study, a method was developed for imaging low solubility cytoskeletal proteins in the lens, a tissue that is filled with high concentrations of soluble crystallins. Optimized tissue washes combined with on-tissue enzymatic digestion allowed successful imaging of peptides corresponding to known lens cytoskeletal proteins. The resulting peptide signals facilitated segmentation of the bovine lens into molecularly distinct regions. A sharp intermediate filament transition from vimentin to lens-specific beaded filament proteins was detected in the lens cortex. MALDI IMS also revealed the region where posttranslational myristoylation of filensin occurs, and the results indicate that truncation and myristoylation of filensin start soon after filensin expression increases in the inner cortex. From intermediate filament switch to filensin truncation and myristoylation, multiple remarkable changes occur in a narrow region of the lens cortex. MALDI images delineated the boundaries of distinct lens regions that will guide further proteomic and interactomic studies.
2020-07-23Multimodal Analysis of Composition and Spatial Architecture in Human Squamous Cell CarcinomaJi AL, Rubin AJ, Thrane K, Jiang S, Reynolds DL, Meyers RM, Guo MG, George BM, Mollbrink A, Bergenstråhle J, Larsson L, Bai Y, Zhu B, Bhaduri A, Meyers JM, Rovira-Clavé X, Hollmig ST, Aasi SZ, Nolan GP, Lundeberg J, Khavari PATMC-StanfordTo define the cellular composition and architecture of cutaneous squamous cell carcinoma (cSCC), we combined single-cell RNA sequencing with spatial transcriptomics and multiplexed ion beam imaging from a series of human cSCCs and matched normal skin. cSCC exhibited four tumor subpopulations, three recapitulating normal epidermal states, and a tumor-specific keratinocyte (TSK) population unique to cancer, which localized to a fibrovascular niche. Integration of single-cell and spatial data mapped ligand-receptor networks to specific cell types, revealing TSK cells as a hub for intercellular communication. Multiple features of potential immunosuppression were observed, including T regulatory cell (Treg) co-localization with CD8 T cells in compartmentalized tumor stroma. Finally, single-cell characterization of human tumor xenografts and in vivo CRISPR screens identified essential roles for specific tumor subpopulation-enriched gene networks in tumorigenesis. These data define cSCC tumor and stromal cell subpopulations, the spatial niches where they interact, and the communicating gene networks that they engage in cancer.
2020-08-31Single-cell metabolic profiling of human cytotoxic T cellsHartmann FJ, Mrdjen D, McCaffrey E, Glass DR, Greenwald NF, Bharadwaj A, Khair Z, Verberk SGS, Baranski A, Baskar R, Graf W, Van Valen D, Van den Bossche J, Angelo M, Bendall SC.RTI-StanfordCellular metabolism regulates immune cell activation, differentiation and effector functions, but current metabolic approaches lack single-cell resolution and simultaneous characterization of cellular phenotype. In this study, we developed an approach to characterize the metabolic regulome of single cells together with their phenotypic identity. The method, termed single-cell metabolic regulome profiling (scMEP), quantifies proteins that regulate metabolic pathway activity using high-dimensional antibody-based technologies. We employed mass cytometry (cytometry by time of flight, CyTOF) to benchmark scMEP against bulk metabolic assays by reconstructing the metabolic remodeling of in vitro-activated naive and memory CD8+ T cells. We applied the approach to clinical samples and identified tissue-restricted, metabolically repressed cytotoxic T cells in human colorectal carcinoma. Combining our method with multiplexed ion beam imaging by time of flight (MIBI-TOF), we uncovered the spatial organization of metabolic programs in human tissues, which indicated exclusion of metabolically repressed immune cells from the tumor-immune boundary. Overall, our approach enables robust approximation of metabolic and functional states in individual cells.
2020-09-01Nanopore sequencing and the Shasta toolkit enable efficient de novo assembly of eleven human genomesShafin K, Pesout T, Lorig-Roach R, Haukness M, Olsen HE, Bosworth C, Armstrong J, Tigyi K, Maurer N, Koren S, Sedlazeck FJ, Marschall T, Mayes S, Costa V, Zook JM, Liu KJ, Kilburn D, Sorensen M, Munson KM, Vollger MR, Monlong J, Garrison E, Eichler EE, Salama S, Haussler D, Green RE, Akeson M, Phillippy A, Miga KH, Carnevali P, Jain M, Paten BHIVE TC-CMUDe novo assembly of a human genome using nanopore long-read sequences has been reported, but it used more than 150,000 CPU hours and weeks of wall-clock time. To enable rapid human genome assembly, we present Shasta, a de novo long-read assembler, and polishing algorithms named MarginPolish and HELEN. Using a single PromethION nanopore sequencer and our toolkit, we assembled 11 highly contiguous human genomes de novo in 9 d. We achieved roughly 63× coverage, 42-kb read N50 values and 6.5× coverage in reads >100 kb using three flow cells per sample. Shasta produced a complete haploid human genome assembly in under 6 h on a single commercial compute node. MarginPolish and HELEN polished haploid assemblies to more than 99.9% identity (Phred quality score QV = 30) with nanopore reads alone. Addition of proximity-ligation sequencing enabled near chromosome-level scaffolds for all 11 genomes. We compare our assembly performance to existing methods for diploid, haploid and trio-binned human samples and report superior accuracy and speed.
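The pairing of "99.9% identity" with "QV = 30" in the Shasta abstract above follows from the standard Phred definition, Q = -10·log10(error rate). The sketch below only illustrates that arithmetic; the function names are hypothetical and are not part of Shasta, MarginPolish, or HELEN.

```python
import math


def phred_from_identity(identity: float) -> float:
    """Convert a fractional identity (e.g. 0.999) to a Phred-scaled quality value."""
    error_rate = 1.0 - identity
    if not 0.0 < error_rate < 1.0:
        raise ValueError("identity must lie strictly between 0 and 1")
    return -10.0 * math.log10(error_rate)


def identity_from_phred(qv: float) -> float:
    """Convert a Phred-scaled quality value back to a fractional identity."""
    return 1.0 - 10.0 ** (-qv / 10.0)


if __name__ == "__main__":
    # 99.9% identity means one error per 1,000 bases.
    print(round(phred_from_identity(0.999), 2))
    print(identity_from_phred(30.0))
```

Each additional 10 QV points corresponds to a tenfold reduction in error rate, so QV 40 would mean 99.99% identity.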
2020-09-01Changes to Zonular Tension Alters the Subcellular Distribution of AQP5 in Regions of Influx and Efflux of Water in the Rat LensPetrova RS, Bavana N, Zhao R, Schey KL, Donaldson PJTMC-Vanderbilt (Eye/pancreas)Purpose: The lens uses circulating fluxes of ions and water that enter the lens at both poles and exit at the equator to maintain its optical properties. We have mapped the subcellular distribution of the lens aquaporins (AQP0, AQP1, and AQP5) in these water influx and efflux zones and investigated how their membrane location is affected by changes in tension applied to the lens by the zonules. Methods: Immunohistochemistry using AQP antibodies was performed on axial sections obtained from rat lenses that had been removed from the eye and then fixed or were fixed in situ to maintain zonular tension. Zonular tension was pharmacologically modulated by applying either tropicamide (increased) or pilocarpine (decreased). AQP labeling was visualized using confocal microscopy. Results: Modulation of zonular tension had no effect on AQP1 or AQP0 labeling in either the water efflux or influx zones. In contrast, AQP5 labeling changed from membranous to cytoplasmic in response to both mechanical and pharmacologically induced reductions in zonular tension in both the efflux zone and anterior (but not posterior) influx zone associated with the lens sutures. Conclusions: Altering zonular tension dynamically regulates the membrane trafficking of AQP5 in the efflux and anterior influx zones to potentially change the magnitude of circulating water fluxes in the lens.
2020-09-03Coordinated Cellular Neighborhoods Orchestrate Antitumoral Immunity at the Colorectal Cancer Invasive FrontSchürch CM, Bhate SS, Barlow GL, Phillips DJ, Noti L, Zlobec I, Chu P, Black S, Demeter J, McIlwain DR, Kinoshita S, Samusik N, Goltsev Y, Nolan GP.TMC-StanfordAntitumoral immunity requires organized, spatially nuanced interactions between components of the immune tumor microenvironment (iTME). Understanding this coordinated behavior in effective versus ineffective tumor control will advance immunotherapies. We re-engineered co-detection by indexing (CODEX) for paraffin-embedded tissue microarrays, enabling simultaneous profiling of 140 tissue regions from 35 advanced-stage colorectal cancer (CRC) patients with 56 protein markers. We identified nine conserved, distinct cellular neighborhoods (CNs), a collection of components characteristic of the CRC iTME. Enrichment of PD-1+CD4+ T cells only within a granulocyte CN positively correlated with survival in a high-risk patient subset. Coupling of tumor and immune CNs, fragmentation of T cell and macrophage CNs, and disruption of inter-CN communication were associated with inferior outcomes. This study provides a framework for interrogating how complex biological processes, such as antitumoral immunity, occur through concerted actions of cells and spatial domains.
2020-09-04The impact of air transport availability on research collaboration: A case study of four universitiesPloszaj A, Yan X, Börner K.HIVE MC-IUThis paper analyzes the impact of air transport connectivity and accessibility on scientific collaboration. Numerous studies have demonstrated that the likelihood of collaboration declines as the distance between potential collaborators increases. These works commonly use simple measures of physical distance rather than actual flight capacity and frequency. Our study addresses this limitation by focusing on the relationship between flight availability and the number of scientific co-publications. Furthermore, we distinguish two components of flight availability: (1) direct and indirect air connections between airports; and (2) distance to the nearest airport from cities and towns where authors of scientific articles have their professional affiliations. Based on Zero-inflated Negative Binomial Regression, we provide evidence that greater flight availability is associated with more frequent scientific collaboration. More flight connections (connectivity) and proximity of airport (accessibility) increase the expected number of coauthored scientific papers. Moreover, direct flights and flights with one transfer are more valuable for intensifying scientific cooperation than travels involving more connecting flights. Further, analysis of four organizational sub-datasets (Arizona State University, Indiana University Bloomington, Indiana University-Purdue University Indianapolis, and University of Michigan) shows that the relationship between airline transport availability and scientific collaboration is not uniform, but is associated with the research profile of an institution and the characteristics of the airport that serves this institution.
2020-09-29Targeting Phosphotyrosine in Native Proteins with Conditional, Bispecific Antibody TrapsZhou XX, Bracken CJ, Zhang K, Zhou J, Mou Y, Wang L, Cheng Y, Leung KK, Wells JA.RTI-NorthwesternEngineering sequence-specific antibodies (Abs) against phosphotyrosine (pY) motifs embedded in folded polypeptides remains highly challenging because of the stringent requirement for simultaneous recognition of the pY motif and the surrounding folded protein epitope. Here, we present a method named phosphotyrosine Targeting by Recombinant Ab Pair, or pY-TRAP, for in vitro engineering of binders for native pY proteins. Specifically, we create the pY protein by unnatural amino acid misincorporation, mutagenize a universal pY-binding Ab to create a first binder B1 for the pY motif on the pY protein, and then select against the B1-pY protein complex for a second binder B2 that recognizes the composite epitope of B1 and the pY-containing protein complex. We applied pY-TRAP to create highly specific binders to folded Ub-pY59, a rarely studied Ub phosphoform exclusively observed in cancerous tissues, and ZAP70-pY248, a kinase phosphoform regulated in feedback signaling pathways in T cells. The pY-TRAPs do not have detectable binding to wild-type proteins or to other pY peptides or proteins tested. This pY-TRAP approach serves as a generalizable method for engineering sequence-specific Ab binders to native pY proteins.
2020-10-06Spatial metabolomics of the human kidney using MALDI trapped ion mobility imaging mass spectrometryNeumann EK, Migas LG, Allen JL, Caprioli RM, Van de Plas R, Spraggins JMTMC-Vanderbilt (Kidney)Low molecular weight metabolites are essential for defining the molecular phenotypes of cells. However, spatial metabolomics tools often lack the sensitivity, specificity, and spatial resolution to provide comprehensive descriptions of these species in tissue. MALDI imaging mass spectrometry (IMS) of low molecular weight ions is particularly challenging as MALDI matrix clusters are often nominally isobaric with multiple metabolite ions, requiring high resolving power instrumentation or derivatization to circumvent this issue. An alternative to this is to perform ion mobility separation before ion detection, enabling the visualization of metabolites without the interference of matrix ions. Additional difficulties surrounding low molecular weight metabolite visualization include achieving high resolution imaging while maintaining sufficient ion numbers for broad and representative analysis of the tissue chemical complement. Here, we use MALDI timsTOF IMS to image low molecular weight metabolites at higher spatial resolution than most metabolite MALDI IMS experiments (20 μm) while maintaining broad coverage within the human kidney. We demonstrate that trapped ion mobility spectrometry (TIMS) can resolve matrix peaks from metabolite signal and separate both isobaric and isomeric metabolites with different distributions within the kidney. The added ion mobility data dimension dramatically increased the peak capacity for spatial metabolomics experiments. Through this improved sensitivity, we have found >40 low molecular weight metabolites in human kidney tissue, such as argininic acid, acetylcarnitine, and choline that localize to the cortex, medulla, and renal pelvis, respectively.
Future work will involve further exploring metabolomic profiles of human kidneys as a function of age, sex, and race.
2020-10-09An Integrated Microfluidic Probe for Mass Spectrometry Imaging of Biological SamplesLi X, Yin R, Hu H, Li Y, Sun X, Dey SK, Laskin J.TTD-PurdueAmbient ionization based on liquid extraction is widely used in mass spectrometry imaging (MSI) of molecules in biological samples. The development of nanospray desorption electrospray ionization (nano-DESI) has enabled the robust imaging of tissue sections with high spatial resolution. However, the fabrication of the nano-DESI probe is challenging, which limits its dissemination to the broader scientific community. Herein, we describe the design and performance of an integrated microfluidic probe (iMFP) for nano-DESI MSI. The glass iMFP, fabricated using photolithography, wet etching, and polishing, shows comparable performance to the capillary-based nano-DESI MSI in terms of stability and sensitivity; a spatial resolution of better than 25 μm was obtained in these first proof-of-principle experiments. The iMFP is easy to operate and align in front of a mass spectrometer, which will facilitate broader use of liquid-extraction-based MSI in biological research, drug discovery, and clinical studies.
2020-10-19CDKL5: a promising new therapeutic target for acute kidney injury?de Caestecker MP.TMC-Vanderbilt (Kidney)Online ahead of print. No abstract available.
2020-10-27Iterative point set registration for aligning scRNA-seq dataAlavi A, Bar-Joseph ZHIVE TC-CMUSeveral studies profile similar single cell RNA-Seq (scRNA-Seq) data using different technologies and platforms. A number of alignment methods have been developed to enable the integration and comparison of scRNA-Seq data from such studies. While each performs well on some of the datasets, to date no method was able to both perform the alignment using the original expression space and generalize to new data. To enable such analysis we developed Single Cell Iterative Point set Registration (SCIPR) which extends methods that were successfully applied to align image data to scRNA-Seq. We discuss the required changes needed, the resulting optimization function, and algorithms for learning a transformation function for aligning data. We tested SCIPR on several scRNA-Seq datasets. As we show it successfully aligns data from several different cell types, improving upon prior methods proposed for this task. In addition, we show the parameters learned by SCIPR can be used to align data not used in the training and to identify key cell type-specific genes.
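The SCIPR abstract above describes alternating between matching cells across datasets and learning a transformation that aligns them. The sketch below shows that general iterative point set registration loop with the simplest choices available: nearest-neighbour matching and a least-squares affine fit. These choices, and all function names, are assumptions made for illustration; this is not the authors' implementation, whose matching and optimization steps may differ.

```python
import numpy as np


def fit_affine(src, dst):
    """Least-squares affine map (W, b) such that src @ W + b approximates dst."""
    design = np.hstack([src, np.ones((src.shape[0], 1))])  # shape (n, d + 1)
    params, *_ = np.linalg.lstsq(design, dst, rcond=None)
    return params[:-1], params[-1]


def nearest_indices(src, tgt):
    """For every source point, the index of the closest target point."""
    sq_dists = ((src[:, None, :] - tgt[None, :, :]) ** 2).sum(axis=2)
    return sq_dists.argmin(axis=1)


def iterative_registration(source, target, n_iter=10):
    """Alternate matching and affine fitting; return the composed (W, b)."""
    dim = source.shape[1]
    w_total, b_total = np.eye(dim), np.zeros(dim)
    current = source.copy()
    for _ in range(n_iter):
        matched = target[nearest_indices(current, target)]
        w_step, b_step = fit_affine(current, matched)
        current = current @ w_step + b_step
        # Compose this step with everything learned so far.
        w_total = w_total @ w_step
        b_total = b_total @ w_step + b_step
    return w_total, b_total


if __name__ == "__main__":
    # Toy data: a 5 x 5 grid of "cells" and a slightly shifted copy.
    grid = np.array([[x, y] for x in range(5) for y in range(5)], dtype=float)
    shifted = grid + np.array([0.1, -0.1])
    w, b = iterative_registration(shifted, grid)
    print(np.allclose(shifted @ w + b, grid))
```

Because the result is an explicit function of the input coordinates, the same `(W, b)` can be applied to cells that were not used during fitting, which is the generalization property the abstract emphasizes.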
2020-11-01High-Parameter Immune Profiling with CyTOFSahaf B, Rahman A, Maecker HT, Bendall SCRTI-StanfordMass cytometry, or CyTOF, is a useful technology for high-parameter single-cell phenotyping, especially from suspension cells such as blood or PBMC. It is particularly appealing to monitor the systemic immune changes that could accompany cancer immunotherapy. Here we present a reference panel for identification of all major immune cell populations, with flexibility for addition of trial-specific markers. We also describe best-practice measures for minimizing and tracking batch variability. These include: sample barcoding, use of spiked-in reference cells, and lyophilization of the antibody cocktail. Our protocol assumes the use of cryopreserved PBMC, both for convenience of batching samples and for maximum comparability across patients and time points. Finally, we show an option for automated analysis using the Astrolabe platform (Astrolabe Diagnostics, Inc.).
2020-11-02Landscape of coordinated immune responses to H1N1 challenge in humans.Rahil Z, Leylek R, Schürch CM, Chen H, Bjornson-Hooper Z, Christensen SR, Gherardini PF, Bhate SS, Spitzer MH, Fragiadakis GK, Mukherjee N, Kim N, Jiang S, Yo J, Gaudilliere B, Affrime M, Bock B, Hensley SE, Idoyaga J, Aghaeepour N, Kim K, Nolan GP, McIlwain DR.TMC-StanfordInfluenza is a significant cause of morbidity and mortality worldwide. Here we show changes in the abundance and activation states of more than 50 immune cell subsets in 35 individuals over 11 time points during human A/California/2009 (H1N1) virus challenge monitored using mass cytometry along with other clinical assessments. Peak change in monocyte, B cell, and T cell subset frequencies coincided with peak virus shedding, followed by marked activation of T and NK cells. Results led to the identification of CD38 as a critical regulator of plasmacytoid dendritic cell function in response to influenza virus. Machine learning using study-derived clinical parameters and single-cell data effectively classified and predicted susceptibility to infection. The coordinated immune cell dynamics defined in this study provide a framework for identifying novel correlates of protection in the evaluation of future influenza therapeutics.
2020-11-05Advances in Proximity Ligation in situ Hybridization (PLISH)Nagendran M, Andruska AM, Harbury PB, Desai TJ.TTD-StanfordUnderstanding tissues in the context of development, maintenance and disease requires determining the molecular profiles of individual cells within their native in vivo spatial context. We developed a Proximity Ligation in situ Hybridization technology (PLISH) that enables quantitative measurement of single cell gene expression in intact tissues, which we have now updated. By recording spatial information for every profiled cell, PLISH enables retrospective mapping of distinct cell classes and inference of their in vivo interactions. PLISH has high sensitivity, specificity and signal to noise ratio. It is also rapid, scalable, and does not require expertise in molecular biology so it can be easily adopted by basic and clinical researchers.
2020-11-06Carrier-assisted One-pot Sample Preparation for Targeted Proteomics Analysis of Small Numbers of Human CellsMartin K, Zhang T, Zhang P, Chrisler WB, Thomas FL, Liu F, Liu T, Qian WJ, Smith RD, Shi T.TTD-PNNL/NorthwesternProtein analysis of small numbers of human cells is primarily achieved by targeted proteomics with antibody-based immunoassays, which have inherent limitations (e.g., low multiplex and unavailability of antibodies for new proteins). Mass spectrometry (MS)-based targeted proteomics has emerged as an alternative because it is antibody-free, high multiplex, and has high specificity and quantitation accuracy. Recent advances in MS instrumentation make MS-based targeted proteomics possible for multiplexed quantification of highly abundant proteins in single cells. However, there is a technical challenge for effective processing of single cells with minimal sample loss for MS analysis. To address this issue, we have recently developed a convenient protein carrier-assisted one-pot sample preparation coupled with liquid chromatography (LC) - selected reaction monitoring (SRM) termed cLC-SRM for targeted proteomics analysis of small numbers of human cells. This method capitalizes on using the combined excessive exogenous protein as a carrier and low-volume one-pot processing to greatly reduce surface adsorption losses as well as high-specificity LC-SRM to effectively address the increased dynamic concentration range due to the addition of exogenous carrier protein. Its utility has been demonstrated by accurate quantification of most moderately abundant proteins in small numbers of cells (e.g., 10-100 cells) and highly abundant proteins in single cells. The easy-to-implement features and no need for specific devices make this method readily accessible to most proteomics laboratories.
Herein we have provided a detailed protocol for cLC-SRM analysis of small numbers of human cells including cell sorting, cell lysis and digestion, LC-SRM analysis, and data analysis. Further improvements in detection sensitivity and sample throughput are needed towards targeted single-cell proteomics analysis. We anticipate that cLC-SRM will be broadly applied to biomedical research and systems biology with the potential of facilitating precision medicine.
2020-11-13Guidelines for reporting single-cell RNA-seq experiments.Füllgrabe A, George N, Green M, Nejad P, Aronow B, Fexova SK, Fischer C, Freeberg MA, Huerta L, Morrison N, Scheuermann RH, Taylor D, Vasilevsky N, Clarke L, Gehlenborg N, Kent J, Marioni J, Teichmann S, Brazma A, Papatheodorou IHIVE TC-HarvardNo abstract available.
2020-11-30Tetraspanin-7 regulation of L-type voltage-dependent calcium channels controls pancreatic β-cell insulin secretionDickerson MT, Dadi PK, Butterworth RB, Nakhe AY, Graff SM, Zaborska KE, Schaub CM, Jacobson DATMC-Vanderbilt (Eye/pancreas)Key points: Tetraspanin (TSPAN) proteins regulate many biological processes, including intracellular calcium (Ca2+) handling. TSPAN-7 is enriched in pancreatic islet cells; however, the function of islet TSPAN-7 has not been identified. Here, we characterize how β-cell TSPAN-7 regulates Ca2+ handling and hormone secretion. We find that TSPAN-7 reduces β-cell glucose-stimulated Ca2+ entry, slows Ca2+ oscillation frequency and decreases glucose-stimulated insulin secretion. TSPAN-7 controls β-cell function through a direct interaction with L-type voltage-dependent Ca2+ channels (CaV1.2 and CaV1.3), which reduces channel Ca2+ conductance. TSPAN-7 slows activation of CaV1.2 and accelerates recovery from voltage-dependent inactivation; TSPAN-7 also slows CaV1.3 inactivation kinetics. These findings strongly implicate TSPAN-7 as a key regulator in determining the set-point of glucose-stimulated Ca2+ influx and insulin secretion. Abstract: Glucose-stimulated insulin secretion (GSIS) is regulated by calcium (Ca2+) entry into pancreatic β-cells through voltage-dependent Ca2+ (CaV) channels. Tetraspanin (TSPAN) transmembrane proteins control Ca2+ handling, and thus they may also modulate GSIS. TSPAN-7 is the most abundant islet TSPAN and immunostaining of mouse and human pancreatic slices shows that TSPAN-7 is highly expressed in β- and α-cells; however, the function of islet TSPAN-7 has not been determined. Here, we show that TSPAN-7 knockdown (KD) increases glucose-stimulated Ca2+ influx into mouse and human β-cells. Additionally, mouse β-cell Ca2+ oscillation frequency was accelerated by TSPAN-7 KD.
Because TSPAN-7 KD also enhanced Ca2+ entry when membrane potential was clamped with depolarization, the effect of TSPAN-7 on CaV channel activity was examined. TSPAN-7 KD enhanced L-type CaV currents in mouse and human β-cells. Conversely, heterologous expression of TSPAN-7 with CaV1.2 and CaV1.3 L-type CaV channels decreased CaV currents and reduced Ca2+ influx through both channels. This was presumably the result of a direct interaction of TSPAN-7 and L-type CaV channels because TSPAN-7 coimmunoprecipitated with both CaV1.2 and CaV1.3 from primary human β-cells and from a heterologous expression system. Finally, TSPAN-7 KD in human β-cells increased basal (5.6 mM glucose) and stimulated (45 mM KCl + 14 mM glucose) insulin secretion. These findings strongly suggest that TSPAN-7 modulation of β-cell L-type CaV channels is a key determinant of β-cell glucose-stimulated Ca2+ entry and thus the set-point of GSIS.
2020-12-01Effect of MALDI matrices on lipid analyses of biological tissues using MALDI-2 postionization mass spectrometryMcMillen JC, Fincher JA, Klein DR, Spraggins JM, Caprioli RMTMC-Vanderbilt (Kidney)Matrix-assisted laser desorption/ionization imaging mass spectrometry (MALDI IMS) allows for highly multiplexed, untargeted detection of many hundreds of analytes from tissue. Recently, laser postionization (MALDI-2) has been developed for increased ion yield and sensitivity for lipid IMS. However, the dependence of MALDI-2 performance on the various lipid classes is largely unknown. To understand the effect of the applied matrix on MALDI-2 analysis of lipids, samples including an equimolar lipid standard mixture, various tissue homogenates, and intact rat kidney tissue sections were analyzed using the following matrices: α-cyano-4-hydroxycinnamic acid, 2',5'-dihydroxyacetophenone, 2',5'-dihydroxybenzoic acid (DHB), and norharmane (NOR). Lipid signal enhancement of protonated species using MALDI-2 technology varied based on the matrix used. Although signal improvements were observed for all matrices, the most dramatic effects using MALDI-2 were observed using NOR and DHB. For lipid standards analyzed by MALDI-2, NOR provided the broadest coverage, enabling the detection of all 13 protonated standards, including nonpolar lipids, whereas DHB gave less coverage but gave the highest signal increase for those lipids recorded. With respect to tissue homogenates and rat kidney tissue, mass spectra were compared and showed that the number and intensity of neutral lipids tentatively identified with MALDI-2 using NOR increased significantly (e.g., fivefold intensity increase for triacylglycerol). In the cases of DHB with MALDI-2, the number of protonated lipids identified from tissue homogenates doubled with 152 on average compared with 76 with MALDI alone. 
High spatial resolution imaging (~20 μm) of rat kidney tissue showed similar results using DHB with 125 lipids tentatively identified from MALDI-2 spectra versus just 72 using standard MALDI. From the four matrices tested, NOR provided the greatest increase in sensitivity for neutral lipids (triacylglycerol, diacylglycerol, monoacylglycerol, and cholesterol ester), and DHB provided the highest overall number of lipids detected using MALDI-2 technology.
2020-12-01Integrating ion mobility and imaging mass spectrometry for comprehensive analysis of biological tissues: A brief review and perspectiveRivera ES, Djambazova KV, Neumann EK, Caprioli RM, Spraggins JMTMC-Vanderbilt (Kidney)Imaging mass spectrometry (IMS) technologies are capable of mapping a wide array of biomolecules in diverse cellular and tissue environments. IMS has emerged as an essential tool for providing spatially targeted molecular information due to its high sensitivity, wide molecular coverage, and chemical specificity. One of the major challenges for mapping the complex cellular milieu is the presence of many isomers and isobars in these samples. This challenge is traditionally addressed using orthogonal liquid chromatography (LC)-based analysis, although common approaches such as chromatography and electrophoresis cannot be performed at timescales that are compatible with most imaging applications. Ion mobility offers rapid, gas-phase separations that are readily integrated with IMS workflows to provide additional data dimensionality that can improve signal-to-noise, dynamic range, and specificity. Here, we highlight recent examples of ion mobility coupled to IMS and discuss their importance to the field.
2020-12-01Progenitor identification and SARS-CoV-2 infection in human distal lung organoidsSalahudeen AA, Choi SS, Rustagi A, Zhu J, van Unen V, de la O SM, Flynn RA, Margalef-Català M, Santos AJM, Ju J, Batish A, Usui T, Zheng GXY, Edwards CE, Wagar LE, Luca V, Anchang B, Nagendran M, Nguyen K, Hart DJ, Terry JM, Belgrader P, Ziraldo SB, Mikkelsen TS, Harbury PB, Glenn JS, Garcia KC, Davis MM, Baric RS, Sabatti C, Amieva MR, Blish CA, Desai TJ, Kuo CJ.TTD-StanfordThe distal lung contains terminal bronchioles and alveoli that facilitate gas exchange. Three-dimensional in vitro human distal lung culture systems would strongly facilitate the investigation of pathologies such as interstitial lung disease, cancer and coronavirus disease 2019 (COVID-19) pneumonia caused by severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2). Here we describe the development of a long-term feeder-free, chemically defined culture system for distal lung progenitors as organoids derived from single adult human alveolar epithelial type II (AT2) or KRT5+ basal cells. AT2 organoids were able to differentiate into AT1 cells, and basal cell organoids developed lumens lined with differentiated club and ciliated cells. Single-cell analysis of KRT5+ cells in basal organoids revealed a distinct population of ITGA6+ITGB4+ mitotic cells, whose offspring further segregated into a TNFRSF12Ahi subfraction that comprised about ten per cent of KRT5+ basal cells. This subpopulation formed clusters within terminal bronchioles and exhibited enriched clonogenic organoid growth activity. We created distal lung organoids with apical-out polarity to present ACE2 on the exposed external surface, facilitating infection of AT2 and basal cultures with SARS-CoV-2 and identifying club cells as a target population. 
This long-term, feeder-free culture of human distal lung organoids, coupled with single-cell analysis, identifies functional heterogeneity among basal cells and establishes a facile in vitro organoid model of human distal lung infections, including COVID-19-associated pneumonia.
2020-12-02Lipid Landscape of the Human Retina and Supporting Tissues Revealed by High-Resolution Imaging Mass SpectrometryAnderson DMG, Messinger JD, Patterson NH, Rivera ES, Kotnala A, Spraggins JM, Caprioli RM, Curcio CA, Schey KL.TMC-Vanderbilt (Eye/pancreas)The human retina provides vision at light levels ranging from starlight to sunlight. Its supporting tissues regulate plasma-delivered lipophilic essentials for vision, including retinoids. The macula is an anatomic specialization for high-acuity and color vision that is also vulnerable to prevalent blinding diseases. The retina's exquisite architecture comprises numerous cell types that are aligned horizontally, yielding structurally distinct cell, synaptic, and vascular layers that are visible in histology and in diagnostic clinical imaging. MALDI imaging mass spectrometry (IMS) is now capable of uniting low micrometer spatial resolution with high levels of chemical specificity. In this study, a multimodal imaging approach fortified with accurate multi-image registration was used to localize lipids in human retina tissue at laminar, cellular, and subcellular levels. Multimodal imaging results indicate differences in distributions and abundances of lipid species across and within single cell types. Of note are distinct localizations of signals within specific layers of the macula. For example, phosphatidylethanolamine and phosphatidylinositol lipids were localized to central RPE cells, whereas specific plasmalogen lipids were localized to cells of the perifoveal RPE and Henle fiber layer. Subcellular compartments of photoreceptors were distinguished by PE(20:0_22:5) in the outer nuclear layer, PE(18:0_22:6) in outer and inner segments, and cardiolipin CL(70:5) in the mitochondria-rich inner segments. Several lipids, differing by a single double bond, have markedly different distributions between the central fovea and the ganglion cell and inner nuclear layers. 
A lipid atlas, initiated in this study, can serve as a reference database for future examination of diseased tissues.
2020-12-02Multimodal Imaging Mass Spectrometry: Next Generation Molecular Mapping in Biology and MedicineNeumann EK, Djambazova KV, Caprioli RM, Spraggins JMTMC-Vanderbilt (Kidney)Imaging mass spectrometry has become a mature molecular mapping technology that is used for molecular discovery in many medical and biological systems. While powerful by itself, imaging mass spectrometry can be complemented by the addition of other orthogonal, chemically informative imaging technologies to maximize the information gained from a single experiment and enable deeper understanding of biological processes. Within this review, we describe MALDI, SIMS, and DESI imaging mass spectrometric technologies and how these have been integrated with other analytical modalities such as microscopy, transcriptomics, spectroscopy, and electrochemistry in a field termed multimodal imaging. We explore the future of this field and discuss forthcoming developments that will bring new insights to help unravel the molecular complexities of biological systems, from single cells to functional tissue structures and organs.
2020-12-10GCNG: graph convolutional networks for inferring gene interaction from spatial transcriptomics dataYuan Y, Bar-Joseph ZHIVE TC-CMUMost methods for inferring gene-gene interactions from expression data focus on intracellular interactions. The availability of high-throughput spatial expression data opens the door to methods that can infer such interactions both within and between cells. To achieve this, we developed Graph Convolutional Neural networks for Genes (GCNG). GCNG encodes the spatial information as a graph and combines it with expression data using supervised training. GCNG improves upon prior methods used to analyze spatial transcriptomics data and can propose novel pairs of extracellular interacting genes. The output of GCNG can also be used for downstream analysis including functional gene assignment. Supporting website with software and data: https://github.com/xiaoyeye/GCNG
2020-12-10High-Spatial-Resolution Multi-Omics Sequencing via Deterministic Barcoding in TissueLiu Y, Yang M, Deng Y, Su G, Enninful A, Guo CC, Tebaldi T, Zhang D, Kim D, Bai Z, Norris E, Pan A, Li J, Xiao Y, Halene S, Fan RTTD-YaleWe present deterministic barcoding in tissue for spatial omics sequencing (DBiT-seq) for co-mapping of mRNAs and proteins in a formaldehyde-fixed tissue slide via next-generation sequencing (NGS). Parallel microfluidic channels were used to deliver DNA barcodes to the surface of a tissue slide, and crossflow of two sets of barcodes, A1-50 and B1-50, followed by ligation in situ, yielded a 2D mosaic of tissue pixels, each containing a unique full barcode AB. Application to mouse embryos revealed major tissue types in early organogenesis as well as fine features like microvasculature in a brain and pigmented epithelium in an eye field. Gene expression profiles in 10-μm pixels conformed into the clusters of single-cell transcriptomes, allowing for rapid identification of cell types and spatial distributions. DBiT-seq can be adopted by researchers with no experience in microfluidics and may find applications in a range of fields including developmental biology, cancer biology, neuroscience, and clinical pathology.
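The crossflow barcoding described in the DBiT-seq abstract above can be illustrated with a small sketch. It only enumerates how two sets of 50 barcodes combine into a 50 x 50 mosaic of uniquely labeled pixels; the function name `build_pixel_barcodes` and the string labels are hypothetical and are not taken from the DBiT-seq software.

```python
from itertools import product


def build_pixel_barcodes(n_a=50, n_b=50):
    """Map each (row, column) pixel to its combined (A, B) barcode pair.

    Assumption for illustration: barcode set A is delivered along rows
    and barcode set B along columns, so pixel (i, j) carries A(i+1) and
    B(j+1).
    """
    a_codes = [f"A{i}" for i in range(1, n_a + 1)]
    b_codes = [f"B{j}" for j in range(1, n_b + 1)]
    return {
        (i, j): (a_codes[i], b_codes[j])
        for i, j in product(range(n_a), range(n_b))
    }


grid = build_pixel_barcodes()
print(len(grid))               # number of tissue pixels
print(len(set(grid.values()))) # number of distinct full barcodes
```

Because each pixel receives exactly one A barcode and one B barcode, the number of distinct full barcodes equals the number of pixels, which is what lets sequencing reads be assigned back to a position on the slide.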
2020-12-18RIPK3-mediated inflammation is a conserved β cell response to ER stressYang B, Maddison LA, Zaborska KE, Dai C, Yin L, Tang Z, Zang L, Jacobson DA, Powers AC, Chen WTMC-Vanderbilt (Eye/pancreas)Islet inflammation is an important etiopathology of type 2 diabetes; however, the underlying mechanisms are not well defined. Using complementary experimental models, we discovered RIPK3-dependent IL1B induction in β cells as an instigator of islet inflammation. In cultured β cells, ER stress activated RIPK3, leading to NF-κB-mediated proinflammatory gene expression. In a zebrafish muscle insulin resistance model, overnutrition caused islet inflammation, β cell dysfunction, and loss in an ER stress-, ripk3-, and il1b-dependent manner. In mouse islets, high-fat diet triggered IL1B expression in β cells before macrophage recruitment in vivo, and RIPK3 inhibition suppressed palmitate-induced β cell dysfunction and Il1b expression in vitro. Furthermore, in human islets grafted in hyperglycemic mice, a marked increase in ER stress, RIPK3, and NF-κB activation in β cells was accompanied by murine macrophage infiltration. Thus, RIPK3-mediated induction of proinflammatory mediators is a conserved, previously unrecognized β cell response to metabolic stress and a mediator of the ensuing islet inflammation.
2021-01-01A multimodal and integrated approach to interrogate human kidney biopsies with rigor and reproducibility: guidelines from the Kidney Precision Medicine ProjectEl-Achkar TM, Eadon MT, Menon R, Lake BB, Sigdel TK, Alexandrov T, Parikh S, Zhang G, Dobi D, Dunn KW, Otto EA, Anderton CR, Carson JM, Luo J, Park C, Hamidi H, Zhou J, Hoover P, Schroeder A, Joanes M, Azeloglu EU, Sealfon R, Winfree S, Steck B, He Y, D'Agati V, Iyengar R, Troyanskaya OG, Barisoni L, Gaut J, Zhang K, Laszik Z, Rovin BH, Dagher PC, Sharma K, Sarwal MM, Hodgin JB, Alpers CE, Kretzler M, Jain STMC-UCSDComprehensive and spatially mapped molecular atlases of organs at a cellular level are a critical resource to gain insights into pathogenic mechanisms and personalized therapies for diseases. The Kidney Precision Medicine Project (KPMP) is an endeavor to generate three-dimensional (3-D) molecular atlases of healthy and diseased kidney biopsies by using multiple state-of-the-art omics and imaging technologies across several institutions. Obtaining rigorous and reproducible results from disparate methods and at different sites to interrogate biomolecules at a single-cell level or in 3-D space is a significant challenge that can be a futile exercise if not well controlled. We describe a "follow the tissue" pipeline for generating a reliable and authentic single-cell/region 3-D molecular atlas of human adult kidney. Our approach emphasizes quality assurance, quality control, validation, and harmonization across different omics and imaging technologies from sample procurement, processing, storage, shipping to data generation, analysis, and sharing. We established benchmarks for quality control, rigor, reproducibility, and feasibility across multiple technologies through a pilot experiment using common source tissue that was processed and analyzed at different institutions and different technologies. 
A peer review system was established to critically review quality control measures and the reproducibility of data generated by each technology before their being approved to interrogate clinical biopsy specimens. The process established economizes the use of valuable biopsy tissue for multiomics and imaging analysis with stringent quality control to ensure rigor and reproducibility of results and serves as a model for precision medicine projects across laboratories, institutions and consortia.
2021-01-18Deep Learning Approach for Dynamic Sparse Sampling for High-Throughput Mass Spectrometry ImagingHelminiak D, Hu H, Laskin J, Ye DHTTD-PurdueA Supervised Learning Approach for Dynamic Sampling (SLADS) addresses traditional issues with the incorporation of stochastic processes into a compressed sensing method. Statistical features, extracted from a sample reconstruction, estimate entropy reduction with regression models, in order to dynamically determine optimal sampling locations. This work introduces an enhanced SLADS method, in the form of a Deep Learning Approach for Dynamic Sampling (DLADS), showing reductions in sample acquisition times for high-fidelity reconstructions of ~70-80% over traditional rectilinear scanning. These improvements are demonstrated for dimensionally asymmetric, high-resolution molecular images of mouse uterine and kidney tissues, as obtained using Nanospray Desorption ElectroSpray Ionization (nano-DESI) Mass Spectrometry Imaging (MSI). The methodology for training set creation is adjusted to mitigate stretching artifacts generated when using prior SLADS approaches. Transitioning to DLADS removes the need for feature extraction, further advanced with the employment of convolutional layers to leverage inter-pixel spatial relationships. Additionally, DLADS demonstrates effective generalization, despite dissimilar training and testing data. Overall, DLADS is shown to maximize potential experimental throughput for nano-DESI MSI.
2021-01-27Integrated spatial genomics reveals global architecture of single nucleiTakei Y, Yun J, Zheng S, Ollikainen N, Pierson N, White J, Shah S, Thomassie J, Suo S, Eng CL, Guttman M, Yuan GC, Cai L.TTD-Cal TechIdentifying the relationships between chromosome structures, nuclear bodies, chromatin states and gene expression is an overarching goal of nuclear-organization studies. Because individual cells appear to be highly variable at all these levels, it is essential to map different modalities in the same cells. Here we report the imaging of 3,660 chromosomal loci in single mouse embryonic stem (ES) cells using DNA seqFISH+, along with 17 chromatin marks and subnuclear structures by sequential immunofluorescence and the expression profile of 70 RNAs. Many loci were invariably associated with immunofluorescence marks in single mouse ES cells. These loci form 'fixed points' in the nuclear organizations of single cells and often appear on the surfaces of nuclear bodies and zones defined by combinatorial chromatin marks. Furthermore, highly expressed genes appear to be pre-positioned to active nuclear zones, independent of bursting dynamics in single cells. Our analysis also uncovered several distinct mouse ES cell subpopulations with characteristic combinatorial chromatin states. Using clonal analysis, we show that the global levels of some chromatin marks, such as H3 trimethylation at lysine 27 (H3K27me3) and macroH2A1 (mH2A1), are heritable over at least 3-4 generations, whereas other marks fluctuate on a faster time scale. This seqFISH+-based spatial multimodal approach can be used to explore nuclear organization and cell states in diverse biological systems.
2021-01-28A Generic Framework and Library for Exploration of Small Multiples through Interactive PilingLekschas F, Zhou X, Chen W, Gehlenborg N, Bach B, Pfister HHIVE TC-HarvardSmall multiples are miniature representations of visual information used generically across many domains. Handling large numbers of small multiples imposes challenges on many analytic tasks like inspection, comparison, navigation, or annotation. To address these challenges, we developed a framework and implemented a library called Piling.js for designing interactive piling interfaces. Based on the piling metaphor, such interfaces afford flexible organization, exploration, and comparison of large numbers of small multiples by interactively aggregating visual objects into piles. Based on a systematic analysis of previous work, we present a structured design space to guide the design of visual piling interfaces. To enable designers to efficiently build their own visual piling interfaces, Piling.js provides a declarative interface to avoid having to write low-level code and implements common aspects of the design space. An accompanying GUI additionally supports the dynamic configuration of the piling interface. We demonstrate the expressiveness of Piling.js with examples from machine learning, immunofluorescence microscopy, genomics, and public health.
2021-01-31Predictive modeling of single-cell DNA methylome data enhances integration with transcriptome dataUzun Y, Wu H, Tan K.TMC-CHOPSingle-cell DNA methylation data has become increasingly abundant and has uncovered many genes with a positive correlation between expression and promoter methylation, challenging the common dogma based on bulk data. However, computational tools for analyzing single-cell methylome data are lagging far behind. A number of tasks, including cell type calling and integration with transcriptome data, require the construction of a robust gene activity matrix, which is a prerequisite but challenging task. The advent of multi-omics data enables measurement of both DNA methylation and gene expression for the same single cells. Although such data are rather sparse, they are sufficient to train supervised models that capture the complex relationship between DNA methylation and gene expression and predict gene activities at single-cell level. Here, we present methylome association by predictive linkage to expression (MAPLE), a computational framework that learns the association between DNA methylation and expression using both gene- and cell-dependent statistical features. Using multiple data sets generated with different experimental protocols, we show that using predicted gene activity values significantly improves several analysis tasks, including clustering, cell type identification, and integration with transcriptome data. Application of MAPLE revealed several interesting biological insights into the relationship between methylation and gene expression, including asymmetric importance of methylation signals around transcription start site for predicting gene expression, and increased predictive power of methylation signals in promoters located outside CpG islands and shores. With the rapid accumulation of single-cell epigenomics data, MAPLE provides a general framework for integrating such data with transcriptome data.
2021-02-15Construction of a Multi-Phase Contrast Computed Tomography Kidney AtlasLee HH, Tang Y, Xu K, Bao S, Fogo AB, Harris R, de Caestecker MP, Heinrich M, Spraggins JM, Huo Y, Landman BATMC-Vanderbilt (Kidney)The Human BioMolecular Atlas Program (HuBMAP) seeks to create a molecular atlas at the cellular level of the human body to spur interdisciplinary innovations across spatial and temporal scales. While the preponderance of effort is allocated towards cellular and molecular scale mapping, differentiating and contextualizing findings within tissues, organs and systems are essential for the HuBMAP efforts. The kidney is an initial organ target of HuBMAP, and constructing a framework (or atlas) for integrating information across scales is needed for visualizing and integrating information. However, there is no abdominal atlas currently available in the public domain. Substantial variation in healthy kidneys exists with sex, body size, and imaging protocols. With the integration of clinical archives for secondary research use, we are able to build atlases based on a diverse population and clinically relevant protocols. In this study, we created a computed tomography (CT) phase-specific atlas for the abdomen, which is optimized for the kidney organ. A two-stage registration pipeline was used by registering extracted abdominal volume of interest from body part regression, to a high-resolution CT. Affine and non-rigid registration were performed to all scans hierarchically. To generate and evaluate the atlas, multiphase CT scans of 500 control subjects (age: 15 - 50, 250 males, 250 females) are registered to the atlas target through the complete pipeline. The abdominal body and kidney registration are shown to be stable with the variance map computed from the result average template. 
Both left and right kidneys are substantially localized in the high-resolution target space, which successfully demonstrated the sharp details of its anatomical characteristics across each phase. We illustrated the applicability of the atlas template for integrating across normal kidney variation from 64 cm³ to 302 cm³.
2021-02-15Renal Cortex, Medulla and Pelvicaliceal System Segmentation on Arterial Phase CT Images with Random Patch-based NetworksTang Y, Gao R, Lee HH, Xu Z, Savoie BV, Bao S, Huo Y, Fogo AB, Harris R, de Caestecker MP, Spraggins J, Landman BATMC-Vanderbilt (Kidney)Renal segmentation on contrast-enhanced computed tomography (CT) provides distinct spatial context and morphology. Current studies for renal segmentations are highly dependent on manual efforts, which are time-consuming and tedious. Hence, developing an automatic framework for the segmentation of renal cortex, medulla and pelvicalyceal system is important for quantitative assessment of renal morphometry. Recent innovations in deep methods have driven performance toward levels for which clinical translation is appealing. However, the segmentation of renal structures can be challenging due to the limited field-of-view (FOV) and variability among patients. In this paper, we propose a method to automatically label the renal cortex, the medulla and pelvicalyceal system. First, we retrieved 45 clinically-acquired deidentified arterial phase CT scans (45 patients, 90 kidneys) without diagnosis codes (ICD-9) involving kidney abnormalities. Second, an interpreter performed manual segmentation of the pelvis, medulla and cortex slice-by-slice on all retrieved subjects under expert supervision. Finally, we proposed a patch-based deep neural network to automatically segment renal structures. Compared to the automatic baseline algorithm (3D U-Net) and conventional hierarchical method (3D U-Net Hierarchy), our proposed method achieves a mean Dice score of 0.7968 across three classes, versus 0.6749 (3D U-Net) and 0.7482 (3D U-Net Hierarchy) (p-value < 0.001, paired t-tests between our method and 3D U-Net Hierarchy). In summary, the proposed algorithm provides a precise and efficient method for labeling renal structures.
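For readers unfamiliar with the metric reported in the renal segmentation abstract above, the following is a minimal sketch of a mean Dice score over several classes. It works on flat lists of integer labels rather than CT volumes, and the function names and the label coding (1 = cortex, 2 = medulla, 3 = pelvicalyceal system) are assumptions made for illustration, not the authors' evaluation code.

```python
def dice(pred, truth, label):
    """Dice coefficient 2|P∩T| / (|P| + |T|) for one class label."""
    p = [x == label for x in pred]
    t = [x == label for x in truth]
    overlap = sum(a and b for a, b in zip(p, t))
    total = sum(p) + sum(t)
    # Convention assumed here: a class absent from both inputs scores 1.0.
    return 2.0 * overlap / total if total else 1.0


def mean_dice(pred, truth, labels):
    """Unweighted average of per-class Dice coefficients."""
    return sum(dice(pred, truth, lab) for lab in labels) / len(labels)


# Toy voxel labels: 0 = background, 1-3 = hypothetical structure codes.
prediction = [1, 1, 2, 3, 0, 2]
reference = [1, 2, 2, 3, 0, 3]
print(mean_dice(prediction, reference, labels=[1, 2, 3]))
```

Averaging per class, rather than pooling all voxels, keeps a small structure from being swamped by a large one in the summary score.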
2021-02-23Spatial Segmentation of Mass Spectrometry Imaging Data by Combining Multivariate Clustering and Univariate Thresholding.Hu H, Yin R, Brown HM, Laskin JTTD-PurdueSpatial segmentation partitions mass spectrometry imaging (MSI) data into distinct regions, providing a concise visualization of the vast amount of data and identifying regions of interest (ROIs) for downstream statistical analysis. Unsupervised approaches are particularly attractive, as they may be used to discover the underlying subpopulations present in the high-dimensional MSI data without prior knowledge of the properties of the sample. Herein, we introduce an unsupervised spatial segmentation approach, which combines multivariate clustering and univariate thresholding to generate comprehensive spatial segmentation maps of the MSI data. This approach combines matrix factorization and manifold learning to enable high-quality image segmentation without an extensive hyperparameter search. In parallel, some ion images inadequately represented in the multivariate analysis were treated using univariate thresholding to generate complementary spatial segments. The final spatial segmentation map was assembled from segment candidates that were generated using both techniques. We demonstrate the performance and robustness of this approach for two MSI data sets of mouse uterine and kidney tissue sections that were acquired with different spatial resolutions. The resulting segmentation maps are easy to interpret and project onto the known anatomical regions of the tissue.
2021-03-01Surfactant-assisted one-pot sample preparation for label-free single-cell proteomicsTsai, CF., Zhang, P., Scholten, D. et al.TTD-PNNL/NorthwesternLarge numbers of cells are generally required for quantitative global proteome profiling due to surface adsorption losses associated with sample processing. Such bulk measurement obscures important cell-to-cell variability (cell heterogeneity) and makes proteomic profiling impossible for rare cell populations (e.g., circulating tumor cells (CTCs)). Here we report a surfactant-assisted one-pot sample preparation coupled with mass spectrometry (MS) method termed SOP-MS for label-free global single-cell proteomics. SOP-MS capitalizes on the combination of a MS-compatible nonionic surfactant, n-Dodecyl-β-D-maltoside, and hydrophobic surface-based low-bind tubes or multi-well plates for ‘all-in-one’ one-pot sample preparation. This ‘all-in-one’ method including elimination of all sample transfer steps maximally reduces surface adsorption losses for effective processing of single cells, thus improving detection sensitivity for single-cell proteomics. This method allows convenient label-free quantification of hundreds of proteins from single human cells and ~1200 proteins from small tissue sections (close to ~20 cells). When applied to a patient CTC-derived xenograft (PCDX) model at the single-cell resolution, SOP-MS can reveal distinct protein signatures between primary tumor cells and early metastatic lung cells, which are related to the selection pressure of anti-tumor immunity during breast cancer metastasis. The approach paves the way for routine, precise, quantitative single-cell proteomics.
2021-03-08Giotto: a toolbox for integrative analysis and visualization of spatial expression dataDries R, Zhu Q, Dong R, Eng CL, Li H, Liu K, Fu Y, Zhao T, Sarkar A, Bao F, George RE, Pierson N, Cai L, Yuan GC.TTD-Cal TechSpatial transcriptomic and proteomic technologies have provided new opportunities to investigate cells in their native microenvironment. Here we present Giotto, a comprehensive and open-source toolbox for spatial data analysis and visualization. The analysis module provides end-to-end analysis by implementing a wide range of algorithms for characterizing tissue composition, spatial expression patterns, and cellular interactions. Furthermore, single-cell RNAseq data can be integrated for spatial cell-type enrichment analysis. The visualization module allows users to interactively visualize analysis outputs and imaging features. To demonstrate its general applicability, we apply Giotto to a wide range of datasets encompassing diverse technologies and platforms.
2021-03-16Phase identification for dynamic CT enhancements with generative adversarial networkTang Y, Gao R, Lee HH, Chen Y, Gao D, Bermudez C, Bao S, Huo Y, Savoie BV, Landman BATMC-Vanderbilt (Kidney)Purpose: Dynamic contrast-enhanced computed tomography (CT) is widely used to provide dynamic tissue contrast for diagnostic investigation and vascular identification. However, the phase information of contrast injection is typically recorded manually by technicians, which introduces missing or incorrect labels. Hence, imaging-based contrast phase identification is appealing, but challenging, due to large variations among different contrast protocols, vascular dynamics, and metabolism, especially for clinically acquired CT scans. The purpose of this study is to perform imaging-based phase identification for dynamic abdominal CT using a proposed adversarial learning framework across five representative contrast phases. Methods: A generative adversarial network (GAN) is proposed as a disentangled representation learning model. To explicitly model different contrast phases, a low dimensional common representation and a class specific code are fused in the hidden layer. Then, the low dimensional features are reconstructed following a discriminator and classifier. 36,350 slices of CT scans from 400 subjects are used to evaluate the proposed method with fivefold cross-validation with splits on subjects. Then, 2,216 slice images from 20 independent subjects are employed as independent testing data, which are evaluated using multiclass normalized confusion matrix. Results: The proposed network significantly improved correspondence (0.93) over VGG, ResNet50, StarGAN, and 3DSE with accuracy scores 0.59, 0.62, 0.72, and 0.90, respectively (P < 0.001 Stuart-Maxwell test for normalized multiclass confusion matrix). Conclusion: We show that adversarial learning for discriminator can be beneficial for capturing contrast information among phases. 
The proposed discriminator from the disentangled network achieves promising results.
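The phase identification abstract above evaluates predictions with a multiclass normalized confusion matrix. As a minimal sketch of that idea, assuming row normalization (each true class sums to 1) and integer class codes, the helper below is hypothetical and is not the authors' evaluation code.

```python
def normalized_confusion(truth, pred, n_classes):
    """Row-normalized confusion matrix.

    Entry [t][p] is the fraction of samples with true class t that were
    predicted as class p. Rows for classes with no samples stay all zero.
    """
    counts = [[0] * n_classes for _ in range(n_classes)]
    for t, p in zip(truth, pred):
        counts[t][p] += 1
    matrix = []
    for row in counts:
        row_total = sum(row)
        matrix.append([c / row_total if row_total else 0.0 for c in row])
    return matrix


# Toy example with three hypothetical contrast phases coded 0, 1, 2.
true_phase = [0, 0, 1, 1, 2]
predicted_phase = [0, 1, 1, 1, 2]
for row in normalized_confusion(true_phase, predicted_phase, 3):
    print(row)
```

With row normalization, the diagonal entries are per-class recall, so classes with few slices contribute on the same scale as common ones.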
2021-03-22Islet sympathetic innervation and islet neuropathology in patients with type 1 diabetesCampbell-Thompson M, Butterworth EA, Boatwright JL, Nair MA, Nasif LH, Nasif K, Revell AY, Riva A, Mathews CE, Gerling IC, Schatz DA, Atkinson MATMC-PNNLDysregulation of glucagon secretion in type 1 diabetes (T1D) involves hypersecretion during postprandial states, but insufficient secretion during hypoglycemia. The sympathetic nervous system regulates glucagon secretion. To investigate islet sympathetic innervation in T1D, sympathetic tyrosine hydroxylase (TH) axons were analyzed in control non-diabetic organ donors, non-diabetic islet autoantibody-positive individuals (AAb), and age-matched persons with T1D. Islet TH axon numbers and density were significantly decreased in AAb compared to T1D with no significant differences observed in exocrine TH axon volume or lengths between groups. TH axons were in close approximation to islet α-cells in T1D individuals with long-standing diabetes. Islet RNA-sequencing and qRT-PCR analyses identified significant alterations in noradrenalin degradation, α-adrenergic signaling, cardiac β-adrenergic signaling, catecholamine biosynthesis, and additional neuropathology pathways. The close approximation of TH axons at islet α-cells supports a model for sympathetic efferent neurons directly regulating glucagon secretion. Sympathetic islet innervation and intrinsic adrenergic signaling pathways could be novel targets for improving glucagon secretion in T1D.
2021-04-14CytoTalk: De novo construction of signal transduction networks using single-cell transcriptomic dataHu Y, Peng T, Gao L, Tan KTMC-CHOPSingle-cell technology enables study of signal transduction in a complex tissue at unprecedented resolution. We describe CytoTalk for de novo construction of cell type–specific signaling networks using single-cell transcriptomic data. Using an integrated intracellular and intercellular gene network as the input, CytoTalk identifies candidate pathways using the prize-collecting Steiner forest algorithm. Using high-throughput spatial transcriptomic data and single-cell RNA sequencing data with receptor gene perturbation, we demonstrate that CytoTalk has substantial improvement over existing algorithms. To better understand plasticity of signaling networks across tissues and developmental stages, we perform a comparative analysis of signaling networks between macrophages and endothelial cells across human adult and fetal tissues. Our analysis reveals an overall increased plasticity of signaling networks across adult tissues and specific network nodes that contribute to increased plasticity. CytoTalk enables de novo construction of signal transduction pathways and facilitates comparative analysis of these pathways across tissues and conditions.
2021-04-20Quantitative Mass Spectrometry Imaging of Biological SystemsUnsihuay D, Mesa Sanchez D, Laskin J.TTD-PurdueMass spectrometry imaging (MSI) is a powerful, label-free technique that provides detailed maps of hundreds of molecules in complex samples with high sensitivity and subcellular spatial resolution. Accurate quantification in MSI relies on a detailed understanding of matrix effects associated with the ionization process along with evaluation of the extraction efficiency and mass-dependent ion losses occurring in the analysis step. We present a critical summary of approaches developed for quantitative MSI of metabolites, lipids, and proteins in biological tissues and discuss their current and future applications.
2021-04-26Pancreas Optical Clearing and 3-D Microscopy in Health and DiabetesCampbell-Thompson M, Tang SCTMC-PNNLAlthough first described over a hundred years ago, tissue optical clearing is undergoing renewed interest due to numerous advances in optical clearing methods, microscopy systems, and three-dimensional (3-D) image analysis programs. These advances are advantageous for intact mouse tissues or pieces of human tissues because samples sized several millimeters can be studied. Optical clearing methods are particularly useful for studies of the neuroanatomy of the central and peripheral nervous systems and tissue vasculature or lymphatic system. Using examples from solvent- and aqueous-based optical clearing methods, the mouse and human pancreatic structures and networks will be reviewed in 3-D for neuro-insular complexes, parasympathetic ganglia, and adipocyte infiltration as well as lymphatics in diabetes. Optical clearing with multiplex immunofluorescence microscopy provides new opportunities to examine the role of the nervous and circulatory systems in pancreatic and islet functions by defining their neurovascular anatomy in health and diabetes.
2021-04-27Deeper Protein Identification Using Field Asymmetric Ion Mobility Spectrometry in Top-Down ProteomicsGerbasi VR, Melani RD, Abbatiello SE, Belford MW, Huguet R, McGee JP, Dayhoff D, Thomas PM, Kelleher NLRTI-NorthwesternField asymmetric ion mobility spectrometry (FAIMS), when used in proteomics studies, provides superior selectivity and enables more proteins to be identified by providing additional gas-phase separation. Here, we tested the performance of cylindrical FAIMS for the identification and characterization of proteoforms by top-down mass spectrometry of heterogeneous protein mixtures. Combining FAIMS with chromatographic separation resulted in a 62% increase in protein identifications, an 8% increase in proteoform identifications, and an improvement in proteoform identification compared to samples analyzed without FAIMS. In addition, utilization of FAIMS resulted in the identification of proteins encoded by lower-abundance mRNA transcripts. These improvements were attributable, in part, to improved signal-to-noise for proteoforms with similar retention times. Additionally, our results show that the optimal compensation voltage of any given proteoform was correlated with the molecular weight of the analyte. Collectively these results suggest that the addition of FAIMS can enhance top-down proteomics in both discovery and targeted applications.
2021-05-01Highly multiplexed tissue imaging using repeated oligonucleotide exchange reactionKennedy-Darling J, Bhate SS, Hickey JW, Black S, Barlow GL, Vazquez G, Venkataraaman VG, Samusik N, Goltsev Y, Schürch CM, Nolan GPTMC-StanfordMultiparameter tissue imaging enables analysis of cell-cell interactions in situ, the cellular basis for tissue structure, and novel cell types that are spatially restricted, giving clues to biological mechanisms behind tissue homeostasis and disease. Here, we streamlined and simplified the multiplexed imaging method CO-Detection by indEXing (CODEX) by validating 58 unique oligonucleotide barcodes that can be conjugated to antibodies. We showed that barcoded antibodies retained their specificity for staining cognate targets in human tissue. Antibodies were visualized one at a time by adding a fluorescently labeled oligonucleotide complementary to oligonucleotide barcode, imaging, stripping, and repeating this cycle. With this we developed a panel of 46 antibodies that was used to stain five human lymphoid tissues: three tonsils, a spleen, and a LN. To analyze the data produced, an image processing and analysis pipeline was developed that enabled single-cell analysis on the data, including unsupervised clustering, that revealed 31 cell types across all tissues. We compared cell-type compositions within and directly surrounding follicles from the different lymphoid organs and evaluated cell-cell density correlations. This sequential oligonucleotide exchange technique enables a facile imaging of tissues that leverages pre-existing imaging infrastructure to decrease the barriers to broad use of multiplexed imaging.
2021-05-03Supervised Adversarial Alignment of Single-Cell RNA-seq DataGe S, Wang H, Alavi A, Xing E, Bar-Joseph ZHIVE TC-CMUDimensionality reduction is an important first step in the analysis of single-cell RNA-sequencing (scRNA-seq) data. In addition to enabling the visualization of the profiled cells, such representations are used by many downstream analyses methods ranging from pseudo-time reconstruction to clustering to alignment of scRNA-seq data from different experiments, platforms, and laboratories. Both supervised and unsupervised methods have been proposed to reduce the dimension of scRNA-seq. However, all methods to date are sensitive to batch effects. When batches correlate with cell types, as is often the case, their impact can lead to representations that are batch rather than cell-type specific. To overcome this, we developed a domain adversarial neural network model for learning a reduced dimension representation of scRNA-seq data. The adversarial model tries to simultaneously optimize two objectives. The first is the accuracy of cell-type assignment and the second is the inability to distinguish the batch (domain). We tested the method by using the resulting representation to align several different data sets. As we show, by overcoming batch effects our method was able to correctly separate cell types, improving on several prior methods suggested for this task. Analysis of the top features used by the network indicates that by taking the batch impact into account, the reduced representation is much better able to focus on key genes for each cell type.
2021-05-10SpatialDWLS: accurate deconvolution of spatial transcriptomic dataDong R, Yuan GCTTD-Cal TechRecent development of spatial transcriptomic technologies has made it possible to characterize cellular heterogeneity with spatial information. However, the technology often does not have sufficient resolution to distinguish neighboring cell types. Here, we present spatialDWLS, to quantitatively estimate the cell-type composition at each spatial location. We benchmark the performance of spatialDWLS by comparing it with a number of existing deconvolution methods and find that spatialDWLS outperforms the other methods in terms of accuracy and speed. By applying spatialDWLS to a human developmental heart dataset, we observe striking spatial temporal changes of cell-type composition during development.
2021-05-11Spatial multi-omics sequencing for fixed tissue via DBiT-seqSu G, Qin X, Enninful A, Bai Z, Deng Y, Liu Y, Fan RTTD-Yale
This protocol describes the use of the deterministic barcoding in tissue for spatial omics sequencing platform to construct a multi-omics atlas on fixed frozen tissue samples. This approach uses a microfluidic-based method to introduce combinatorial DNA oligo barcodes directly to the cells in a tissue section fixed on a glass slide. This technique does not directly resolve single cells but can achieve a near-single-cell resolution for spatial transcriptomics and spatial analysis of a targeted panel of proteins. For complete details on the use and execution of this protocol, please refer to Liu et al. (2020).
2021-05-17Identifying signaling genes in spatial single-cell expression dataLi D, Ding J, Bar-Joseph ZHIVE TC-CMUMotivation: Recent technological advances enable the profiling of spatial single-cell expression data. Such data present a unique opportunity to study cell-cell interactions and the signaling genes that mediate them. However, most current methods for the analysis of these data focus on unsupervised descriptive modeling, making it hard to identify key signaling genes and quantitatively assess their impact. Results: We developed a Mixture of Experts for Spatial Signaling genes Identification (MESSI) method to identify active signaling genes within and between cells. The mixture of experts strategy enables MESSI to subdivide cells into subtypes. MESSI relies on multi-task learning using information from neighboring cells to improve the prediction of response genes within a cell. Applying the methods to three spatial single-cell expression datasets, we show that MESSI accurately predicts the levels of response genes, improving upon prior methods and provides useful biological insights about key signaling genes and subtypes of excitatory neuron cells. Availability and implementation: MESSI is available at: https://github.com/doraadong/MESSI. Supplementary information: Supplementary data are available at Bioinformatics online.
2021-05-19Highly Multiplexed Phenotyping of Immunoregulatory Proteins in the Tumor Microenvironment by CODEX Tissue ImagingPhillips D, Schürch CM, Khodadoust MS, Kim YH, Nolan GP, Jiang STMC-StanfordImmunotherapies are revolutionizing cancer treatment by boosting the natural ability of the immune system. In addition to antibodies against traditional checkpoint molecules or their ligands (i.e., CTLA-4, PD-1, and PD-L1), therapies targeting molecules such as ICOS, IDO-1, LAG-3, OX40, TIM-3, and VISTA are currently in clinical trials. To better inform clinical care and the design of therapeutic combination strategies, the co-expression of immunoregulatory proteins on individual immune cells within the tumor microenvironment must be robustly characterized. Highly multiplexed tissue imaging platforms, such as CO-Detection by indEXing (CODEX), are primed to meet this need by enabling >50 markers to be simultaneously analyzed in single-cells on formalin-fixed paraffin-embedded (FFPE) tissue sections. Assembly and validation of antibody panels is particularly challenging, with respect to the specificity of antigen detection and robustness of signal over background. Herein, we report the design, development, optimization, and application of a 56-marker CODEX antibody panel to eight cutaneous T cell lymphoma (CTCL) patient samples. This panel is comprised of structural, tumor, and immune cell markers, including eight immunoregulatory proteins that are approved or currently undergoing clinical trials as immunotherapy targets. Here we provide a resource to enable extensive high-dimensional, spatially resolved characterization of the tissue microenvironment across tumor types and imaging modalities. This framework provides researchers with a readily applicable blueprint to study tumor immunology, tissue architecture, and enable mechanistic insights into immunotherapeutic targets.
2021-05-25RAP-NET: COARSE-TO-FINE MULTI-ORGAN SEGMENTATION WITH SINGLE RANDOM ANATOMICAL PRIORLee HH, Tang Y, Bao S, Abramson RG, Huo Y, Landman BATMC-Vanderbilt (Kidney)Performing coarse-to-fine abdominal multi-organ segmentation facilitates extraction of high-resolution segmentation minimizing the loss of spatial contextual information. However, current coarse-to-refine approaches require a significant number of models to perform single organ segmentation. We propose a coarse-to-fine pipeline RAP-Net, which starts from the extraction of the global prior context of multiple organs from 3D volumes using a low-resolution coarse network, followed by a fine phase that uses a single refined model to segment all abdominal organs instead of multiple organ corresponding models. We combine the anatomical prior with corresponding extracted patches to preserve the anatomical locations and boundary information for performing high-resolution segmentation across all organs in a single model. To train and evaluate our method, a clinical research cohort consisting of 100 patient volumes with 13 organs well-annotated is used. We tested our algorithms with 4-fold cross-validation and computed the Dice score for evaluating the segmentation performance of the 13 organs. Our proposed method using single auto-context outperforms the state-of-the-art on 13 models with an average Dice score 84.58% versus 81.69% (p<0.0001).
2021-05-26Multiomics Imaging Using High-Energy Water Gas Cluster Ion Beam Secondary Ion Mass Spectrometry [(H2O)n-GCIB-SIMS] of Frozen-Hydrated Cells and TissueTian H, Sheraz Née Rabbani S, Vickerman JC, Winograd NTTD-Columbia/Penn StateIntegration of multiomics at the single-cell level allows the unambiguous dissecting of phenotypic heterogeneity at different states such as health, disease, and biomedical response. Imaging mass spectrometry holds the promise of being able to measure multiple types of biomolecules in parallel in the same cell. We have explored the possibility of using water gas cluster ion beam secondary ion mass spectrometry [(H2O)n-GCIB-SIMS] as an analytical tool for multiomics assay. (H2O)n-GCIB has been hailed as an ideal ionization source for biological sampling owing to the enhanced chemical sensitivity and reduced matrix effect. Taking advantage of 1 μm spatial resolution by using a high-energy beam system, we have clearly shown the enhancement of multiple intact biomolecules up to a few hundredfold in single cells. Coupled with the cryogenic sample preparation/measurement, the lipids and metabolites were imaged simultaneously within the cellular region, uncovering the pristine chemistry for integrated omics in the same sample. We have demonstrated that double-charged myelin protein fragments and single-charged multiple lipids and metabolites can be localized in the same cells/tissue with a single acquisition. Our exploration has also been extended to the capability of (H2O)n-GCIB in the generation of multiple charged peptides on protein standards. Frozen hydration combined with (H2O)n-GCIB provides the possibility of universal enhancement for the ionization of multiple bio-molecules, including peptides/proteins which has allowed "omics" to become feasible in the same sample using SIMS.
2021-05-31Body Part Regression With Self-SupervisionTang Y, Gao R, Han S, Chen Y, Gao D, Nath V, Bermudez C, Savona MR, Bao S, Lyu I, Huo Y, Landman BATMC-Vanderbilt (Eye/pancreas)Body part regression is a promising new technique that enables content navigation through self-supervised learning. Using this technique, the global quantitative spatial location for each axial view slice is obtained from computed tomography (CT). However, it is challenging to define a unified global coordinate system for body CT scans due to the large variabilities in image resolution, contrasts, sequences, and patient anatomy. Therefore, the widely used supervised learning approach cannot be easily deployed. To address these concerns, we propose an annotation-free method named blind-unsupervised-supervision network (BUSN). The contributions of the work are fourfold: (1) 1,030 multi-center CT scans are used in developing BUSN without any manual annotation; (2) the proposed BUSN corrects the predictions from unsupervised learning and uses the corrected results as the new supervision; (3) to improve the consistency of predictions, we propose a novel neighbor message passing (NMP) scheme that is integrated with BUSN as a statistical learning based correction; and (4) we introduce a new pre-processing pipeline with inclusion of the BUSN, which is validated on 3D multi-organ segmentation. The proposed method is trained on 1,030 whole body CT scans (230,650 slices) from five datasets, as well as an independent external validation cohort with 100 scans. From the body part regression results, the proposed BUSN achieved a significantly higher median R-squared score (0.9089) than the state-of-the-art unsupervised method (0.7153). When introducing BUSN as a preprocessing stage in volumetric segmentation, the proposed pre-processing pipeline using the BUSN approach increases the total mean Dice score of the 3D abdominal multi-organ segmentation from 0.7991 to 0.8145.
2021-06-07The emerging landscape of single-molecule protein sequencing technologies.Alfaro JA, Bohländer P, Dai M, Filius M, Howard CJ, van Kooten XF, Ohayon S, Pomorski A, Schmid S, Aksimentiev A, Anslyn EV, Bedran G, Cao C, Chinappi M, Coyaud E, Dekker C, Dittmar G, Drachman N, Eelkema R, Goodlett D, Hentz S, Kalathiya U, Kelleher NL, Kelly RT, Kelman Z, Kim SH, Kuster B, Rodriguez-Larrea D, Lindsay S, Maglia G, Marcotte EM, Marino JP, Masselon C, Mayer M, Samaras P, Sarthak K, Sepiashvili L, Stein D, Wanunu M, Wilhelm M, Yin P, Meller A, Joo CRTI-NorthwesternSingle-cell profiling methods have had a profound impact on the understanding of cellular heterogeneity. While genomes and transcriptomes can be explored at the single-cell level, single-cell profiling of proteomes is not yet established. Here we describe new single-molecule protein sequencing and identification technologies alongside innovations in mass spectrometry that will eventually enable broad sequence coverage in single-cell profiling. These technologies will in turn facilitate biological discovery and open new avenues for ultrasensitive disease diagnostics.
2021-06-15Successive High-Resolution (H2O)n-GCIB and C60-SIMS Imaging Integrates Multi-Omics in Different Cell Types in Breast Cancer TissueTian H, Sparvero LJ, Anthonymuthu TS, Sun WY, Amoscato AA, He RR, Bayır H, Kagan VE, Winograd NTTD-Columbia/Penn StateThe temporo-spatial organization of different cells in the tumor microenvironment (TME) is the key to understanding their complex communication networks and the immune landscape that exists within compromised tissues. Multi-omics profiling of single-interacting cells in the native TME is critical for providing further information regarding the reprograming mechanisms leading to immunosuppression and tumor progression. This requires new technologies for biomolecular profiling of phenotypically heterogeneous cells on the same tissue sample. Here, we developed a new methodology for comprehensive lipidomic and metabolomic profiling of individual cells on frozen-hydrated tissue sections using water gas cluster ion beam secondary ion mass spectrometry ((H2O)n-GCIB-SIMS) (at 1.6 μm beam spot size), followed by profiling cell-type specific lanthanide antibodies on the same tissue section using C60-SIMS (at 1.1 μm beam spot size). We revealed distinct variations of distribution and intensities of >150 key ions (e.g., lipids and important metabolites) in different types of the TME individual cells, such as actively proliferating tumor cells as well as infiltrating immune cells. The demonstrated feasibility of SIMS imaging to integrate the multi-omics profiling in the same tissue section at the single-cell level will lead to new insights into the role of lipid reprogramming and metabolic response in normal regulation or pathogenic discoordination of cell-cell interactions in a variety of tissue microenvironments.
2021-06-24Integrated analysis of multimodal single-cell dataHao Y, Hao S, Andersen-Nissen E, Mauck WM 3rd, Zheng S, Butler A, Lee MJ, Wilk AJ, Darby C, Zager M, Hoffman P, Stoeckius M, Papalexi E, Mimitou EP, Jain J, Srivastava A, Stuart T, Fleming LM, Yeung B, Rogers AJ, McElrath JM, Blish CA, Gottardo R, Smibert P, Satija RHIVE MC-NYGCThe simultaneous measurement of multiple modalities represents an exciting frontier for single-cell genomics and necessitates computational methods that can define cellular states based on multimodal data. Here, we introduce "weighted-nearest neighbor" analysis, an unsupervised framework to learn the relative utility of each data type in each cell, enabling an integrative analysis of multiple modalities. We apply our procedure to a CITE-seq dataset of 211,000 human peripheral blood mononuclear cells (PBMCs) with panels extending to 228 antibodies to construct a multimodal reference atlas of the circulating immune system. Multimodal analysis substantially improves our ability to resolve cell states, allowing us to identify and validate previously unreported lymphoid subpopulations. Moreover, we demonstrate how to leverage this reference to rapidly map new datasets and to interpret immune responses to vaccination and coronavirus disease 2019 (COVID-19). Our approach represents a broadly applicable strategy to analyze single-cell multimodal datasets and to look beyond the transcriptome toward a unified and multimodal definition of cellular identity.
2021-07-02Embryo-scale, single-cell spatial transcriptomicsSrivatsan SR, Regier MC, Barkan E, Franks JM, Packer JS, Grosjean P, Duran M, Saxton S, Ladd JJ, Spielmann M, Lois C, Lampe PD, Shendure J, Stevens KR, Trapnell CTMC-Cal TechSpatial patterns of gene expression manifest at scales ranging from local (e.g., cell-cell interactions) to global (e.g., body axis patterning). However, current spatial transcriptomics methods either average local contexts or are restricted to limited fields of view. Here, we introduce sci-Space, which retains single-cell resolution while resolving spatial heterogeneity at larger scales. Applying sci-Space to developing mouse embryos, we captured approximate spatial coordinates and whole transcriptomes of about 120,000 nuclei. We identify thousands of genes exhibiting anatomically patterned expression, leverage spatial information to annotate cellular subtypes, show that cell types vary substantially in their extent of spatial patterning, and reveal correlations between pseudotime and the migratory patterns of differentiating neurons. Looking forward, we anticipate that sci-Space will facilitate the construction of spatially resolved single-cell atlases of mammalian development.
2021-07-06Editorial: Global excellence in inflammatory diseases: North America 2021Kusner LL, Misra RS, Lucas RTMC-URMCNA
2021-07-07New Interface for Faster Proteoform Analysis: Immunoprecipitation Coupled with SampleStream-Mass SpectrometrySantos Seckler HD, Park HM, Lloyd-Jones CM, Melani RD, Camarillo JM, Wilkins JT, Compton PD, Kelleher NLRTI-NorthwesternDifferent proteoform products of the same gene can exhibit differing associations with health and disease, and their patterns of modifications may offer more precise markers of phenotypic differences between individuals. However, currently employed protein-biomarker discovery and quantification tools, such as bottom-up proteomics and ELISAs, are mostly proteoform-unaware. Moreover, the current throughput for proteoform-level analyses by liquid chromatography mass spectrometry (LCMS) for quantitative top-down proteomics is incompatible with population-level biomarker surveys requiring robust, faster proteoform analysis. To this end, we developed immunoprecipitation coupled to SampleStream mass spectrometry (IP-SampleStream-MS) as a high-throughput, automated technique for the targeted quantification of proteoforms. We applied IP-SampleStream-MS to serum samples of 25 individuals to assess the proteoform abundances of apolipoproteins A-I (ApoA-I) and C-III (ApoC-III). The results for ApoA-I were compared to those of LCMS for these individuals, with IP-SampleStream-MS showing a >7-fold higher throughput with >50% better analytical variation. Proteoform abundances measured by IP-SampleStream-MS correlated strongly to LCMS-based values (R² = 0.6-0.9) and produced convergent proteoform-to-phenotype associations, namely, the abundance of canonical ApoA-I was associated with lower HDL-C (R = 0.5) and glycated ApoA-I with higher fasting glucose (R = 0.6). We also observed proteoform-to-phenotype associations for ApoC-III, 22 glycoproteoforms of which were characterized in this study.
The abundance of ApoC-III modified by a single N-acetyl hexosamine (HexNAc) was associated with indices of obesity, such as BMI, weight, and waist circumference (R ∼ 0.7). These data show IP-SampleStream-MS to be a robust, scalable workflow for high-throughput associations of proteoforms to phenotypes.
2021-07-08Mass spectrometry-based metabolomics: a guide for annotation, quantification and best reporting practicesAlseekh S, Aharoni A, Brotman Y, Contrepois K, D'Auria J, Ewald J, C Ewald J, Fraser PD, Giavalisco P, Hall RD, Heinemann M, Link H, Luo J, Neumann S, Nielsen J, Perez de Souza L, Saito K, Sauer U, Schroeder FC, Schuster S, Siuzdak G, Skirycz A, Sumner LW, Snyder MP, Tang H, Tohge T, Wang Y, Wen W, Wu S, Xu G, Zamboni N, Fernie ARTMC-StanfordMass spectrometry-based metabolomics approaches can enable detection and quantification of many thousands of metabolite features simultaneously. However, compound identification and reliable quantification are greatly complicated owing to the chemical complexity and dynamic range of the metabolome. Simultaneous quantification of many metabolites within complex mixtures can additionally be complicated by ion suppression, fragmentation and the presence of isomers. Here we present guidelines covering sample preparation, replication and randomization, quantification, recovery and recombination, ion suppression and peak misidentification, as a means to enable high-quality reporting of liquid chromatography- and gas chromatography-mass spectrometry-based metabolomics-derived data.
2021-08-02CODEX multiplexed tissue imaging with DNA-conjugated antibodiesBlack S, Phillips D, Hickey JW, Kennedy-Darling J, Venkataraaman VG, Samusik N, Goltsev Y, Schürch CM, Nolan GPTMC-StanfordAdvances in multiplexed imaging technologies have drastically improved our ability to characterize healthy and diseased tissues at the single-cell level. Co-detection by indexing (CODEX) relies on DNA-conjugated antibodies and the cyclic addition and removal of complementary fluorescently labeled DNA probes and has been used so far to simultaneously visualize up to 60 markers in situ. CODEX enables a deep view into the single-cell spatial relationships in tissues and is intended to spur discovery in developmental biology, disease and therapeutic design. Herein, we provide optimized protocols for conjugating purified antibodies to DNA oligonucleotides, validating the conjugation by CODEX staining and executing the CODEX multicycle imaging procedure for both formalin-fixed, paraffin-embedded (FFPE) and fresh-frozen tissues. In addition, we describe basic image processing and data analysis procedures. We apply this approach to an FFPE human tonsil multicycle experiment. The hands-on experimental time for antibody conjugation is ~4.5 h, validation of DNA-conjugated antibodies with CODEX staining takes ~6.5 h and preparation for a CODEX multicycle experiment takes ~8 h. The multicycle imaging and data analysis time depends on the tissue size, number of markers in the panel and computational complexity.
2021-08-05Community-wide hackathons to identify central themes in single-cell multi-omics.Lê Cao KA, Abadi AJ, Davis-Marcisak EF, Hsu L, Arora A, Coullomb A, Deshpande A, Feng Y, Jeganathan P, Loth M, Meng C, Mu W, Pancaldi V, Sankaran K, Righelli D, Singh A, Sodicoff JS, Stein-O'Brien GL, Subramanian A, Welch JD, You Y, Argelaguet R, Carey VJ, Dries R, Greene CS, Holmes S, Love MI, Ritchie ME, Yuan GC, Culhane AC, Fertig E.TTD-Cal TechNA
2021-08-10Immunophenotyping assessment in a COVID-19 cohort (IMPACC): A prospective longitudinal studyIMPACC Manuscript Writing Team; IMPACC Network Steering CommitteeTMC-FloridaThe IMmunoPhenotyping Assessment in a COVID-19 Cohort (IMPACC) is a prospective longitudinal study designed to enroll 1000 hospitalized patients with COVID-19 (NCT04378777). IMPACC collects detailed clinical, laboratory and radiographic data along with longitudinal biologic sampling of blood and respiratory secretions for in depth testing. Clinical and lab data are integrated to identify immunologic, virologic, proteomic, metabolomic and genomic features of COVID-19-related susceptibility, severity and disease progression. The goals of IMPACC are to better understand the contributions of pathogen dynamics and host immune responses to the severity and course of COVID-19 and to generate hypotheses for identification of biomarkers and effective therapeutics, including optimal timing of such interventions. In this report we summarize the IMPACC study design and protocols including clinical criteria and recruitment, multi-site standardized sample collection and processing, virologic and immunologic assays, harmonization of assay protocols, high-level analyses and the data sharing plans.
2021-08-13Strategies for Accurate Cell Type Identification in CODEX Multiplexed Imaging DataHickey JW, Tan Y, Nolan GP, Goltsev YTMC-StanfordMultiplexed imaging is a recently developed and powerful single-cell biology research tool. However, it presents new sources of technical noise that are distinct from other types of single-cell data, necessitating new practices for single-cell multiplexed imaging processing and analysis, particularly regarding cell-type identification. Here we created single-cell multiplexed imaging datasets by performing CODEX on four sections of the human colon (ascending, transverse, descending, and sigmoid) using a panel of 47 oligonucleotide-barcoded antibodies. After cell segmentation, we implemented five different normalization techniques crossed with four unsupervised clustering algorithms, resulting in 20 unique cell-type annotations for the same dataset. We generated two standard annotations: hand-gated cell types and cell types produced by over-clustering with spatial verification. We then compared these annotations at four levels of cell-type granularity. First, increasing cell-type granularity led to decreased labeling accuracy; therefore, subtle phenotype annotations should be avoided at the clustering step. Second, accuracy in cell-type identification varied more with normalization choice than with clustering algorithm. Third, unsupervised clustering better accounted for segmentation noise during cell-type annotation than hand-gating. Fourth, Z-score normalization was generally effective in mitigating the effects of noise from single-cell multiplexed imaging. Variation in cell-type identification will lead to significant differential spatial results such as cellular neighborhood analysis; consequently, we also make recommendations for accurately assigning cell-type labels to CODEX multiplexed imaging.
2021-08-27α-Cyano-4-hydroxycinnamic Acid and Tri-Potassium Citrate Salt Pre-Coated Silicon Nanopost Array Provides Enhanced Lipid Detection for High Spatial Resolution MALDI Imaging Mass SpectrometryDufresne M, Fincher JA, Patterson NH, Schey KL, Norris JL, Caprioli RM, Spraggins JMTMC-Vanderbilt (Eye/pancreas)We have developed a pre-coated substrate for matrix-assisted laser desorption/ionization (MALDI) imaging mass spectrometry (IMS) that enables high spatial resolution mapping of both phospholipids and neutral lipid classes in positive ion mode as metal cation adducts. The MALDI substrates are constructed by depositing a layer of α-cyano-4-hydroxycinnamic acid (CHCA) and potassium salts onto silicon nanopost arrays (NAPA) prior to tissue mounting. The matrix/salt pre-coated NAPA substrate significantly enhances all detected lipid signals allowing lipids to be detected at lower laser energies than bare NAPA. The improved sensitivity at lower laser energy enabled ion images to be generated at 10 μm spatial resolution from rat retinal tissue. Optimization of matrix pre-coated NAPA consisted of testing lithium, sodium, and potassium salts along with various matrices to investigate the increased sensitivity toward lipids for MALDI IMS experiments. It was determined that pre-coating NAPA with CHCA and potassium salts before thaw-mounting of tissue resulted in a signal intensity increase of at least 5.8 ± 0.1-fold for phospholipids and 2.0 ± 0.1-fold for neutral lipids compared to bare NAPA. Pre-coating NAPA with matrix and salt also reduced the necessary laser power to achieve desorption/ionization by ∼35%. This reduced the effective diameter of the ablation area from 13 ± 2 μm down to 8 ± 1 μm, enabling high spatial resolution MALDI IMS. Using pre-coated NAPA with CHCA and potassium salts offers a MALDI IMS substrate with broad molecular coverage of lipids in a single polarity that eliminates the need for extensive sample preparation after sectioning.
2021-09-02Deep learning of gene relationships from single cell time-course expression dataYuan Y, Bar-Joseph ZHIVE TC-CMUTime-course gene-expression data have been widely used to infer regulatory and signaling relationships between genes. Most of the widely used methods for such analysis were developed for bulk expression data. Single cell RNA-Seq (scRNA-Seq) data offer several advantages including the large number of expression profiles available and the ability to focus on individual cells rather than averages. However, the data also raise new computational challenges. Using a novel encoding for scRNA-Seq expression data, we develop deep learning methods for interaction prediction from time-course data. Our methods use a supervised framework which represents the data as 3D tensor and train convolutional and recurrent neural networks for predicting interactions. We tested our time-course deep learning (TDL) models on five different time-series scRNA-Seq datasets. As we show, TDL can accurately identify causal and regulatory gene-gene interactions and can also be used to assign new function to genes. TDL improves on prior methods for the above tasks and can be generally applied to new time-series scRNA-Seq data.
2021-09-02Spatially Resolved Proteomic Analysis of the Lens Extracellular Diffusion BarrierWang Z, Cantrell LS, Schey KLTMC-Vanderbilt (Eye/pancreas)Purpose: The presence of a physical barrier to molecular diffusion through lenticular extracellular space has been repeatedly detected. This extracellular diffusion barrier has been proposed to restrict the movement of solutes into the lens and to direct nutrients into the lens core via the sutures at both poles. The purpose of this study is to characterize the molecular components that could contribute to the formation of this barrier. Methods: Three distinct regions in the bovine lens cortex were captured by laser capture microdissection guided by dye penetration. Proteins were digested by Lys C and trypsin. Mass spectrometry-based proteomic analysis followed by gene ontology and protein interaction network analysis was performed. Results: Dye penetration showed that fiber cells first shrink the extracellular spaces of the broad sides followed by closure of the extracellular space between narrow sides at a normalized lens distance (r/a) of 0.9. Accompanying the closure of extracellular space of the broad sides, dramatic proteomic changes were detected, including upregulation of several cell junctional proteins. AQP0 and its interacting partners, Ezrin and Radixin, were among a few proteins that were upregulated, accompanying the closure of extracellular space of the narrow sides, suggesting a particularly important role for AQP0 in controlling the narrowing of the extracellular spaces between fiber cells. The results also provided important information related to biological processes that occur during fiber cell differentiation such as organelle degradation, cytoskeletal remodeling, and glutathione synthesis. Conclusions: The formation of a lens extracellular diffusion barrier is accompanied by significant membrane and cytoskeletal protein remodeling.
2021-09-03Facile One-Pot Nanoproteomics for Label-Free Proteome Profiling of 50-1000 Mammalian CellsMartin K, Zhang T, Lin TT, Habowski AN, Zhao R, Tsai CF, Chrisler WB, Sontag RL, Orton DJ, Lu YJ, Rodland KD, Yang B, Liu T, Smith RD, Qian WJ, Waterman ML, Wiley HS, Shi TTTD-PNNL/NorthwesternRecent advances in sample preparation enable label-free mass spectrometry (MS)-based proteome profiling of small numbers of mammalian cells. However, specific devices are often required to downscale sample processing volume from the standard 50-200 μL to sub-μL for effective nanoproteomics, which greatly impedes the implementation of current nanoproteomics methods by the proteomics research community. Herein, we report a facile one-pot nanoproteomics method termed SOPs-MS (surfactant-assisted one-pot sample processing at the standard volume coupled with MS) for convenient robust proteome profiling of 50-1000 mammalian cells. Building upon our recent development of SOPs-MS for label-free single-cell proteomics at a low μL volume, we have systematically evaluated its processing volume at 10-200 μL using 100 human cells. The processing volume of 50 μL that is in the range of volume for standard proteomics sample preparation has been selected for easy sample handling with a benchtop micropipette. SOPs-MS allows for reliable label-free quantification of ∼1200-2700 protein groups from 50 to 1000 MCF10A cells. When applied to small subpopulations of mouse colon crypt cells, SOPs-MS has revealed protein signatures between distinct subpopulation cells with identification of ∼1500-2500 protein groups for each subpopulation. SOPs-MS may pave the way for routine deep proteome profiling of small numbers of cells and low-input samples.
2021-09-08Automated biomarker candidate discovery in imaging mass spectrometry data through spatially localized Shapley additive explanationsTideman LEM, Migas LG, Djambazova KV, Patterson NH, Caprioli RM, Spraggins JM, Van de Plas RTMC-Vanderbilt (Eye/pancreas)The search for molecular species that are differentially expressed between biological states is an important step towards discovering promising biomarker candidates. In imaging mass spectrometry (IMS), performing this search manually is often impractical due to the large size and high-dimensionality of IMS datasets. Instead, we propose an interpretable machine learning workflow that automatically identifies biomarker candidates by their mass-to-charge ratios, and that quantitatively estimates their relevance to recognizing a given biological class using Shapley additive explanations (SHAP). The task of biomarker candidate discovery is translated into a feature ranking problem: given a classification model that assigns pixels to different biological classes on the basis of their mass spectra, the molecular species that the model uses as features are ranked in descending order of relative predictive importance such that the top-ranking features have a higher likelihood of being useful biomarkers. Besides providing the user with an experiment-wide measure of a molecular species' biomarker potential, our workflow delivers spatially localized explanations of the classification model's decision-making process in the form of a novel representation called SHAP maps. SHAP maps deliver insight into the spatial specificity of biomarker candidates by highlighting in which regions of the tissue sample each feature provides discriminative information and in which regions it does not. SHAP maps also enable one to determine whether the relationship between a biomarker candidate and a biological state of interest is correlative or anticorrelative. 
Our automated approach to estimating a molecular species' potential for characterizing a user-provided biological class, combined with the untargeted and multiplexed nature of IMS, allows for the rapid screening of thousands of molecular species and the obtention of a broader biomarker candidate shortlist than would be possible through targeted manual assessment. Our biomarker candidate discovery workflow is demonstrated on mouse-pup and rat kidney case studies.
2021-09-30Characteristics of p.Gln368Ter Myocilin Variant and Influence of Polygenic Risk on Glaucoma Penetrance in the UK BiobankZebardast N, Sekimitsu S, Wang J, Elze T, Gharahkhani P, Cole BS, Lin MM, Segrè AV, Wiggs JLDP-HarvardPurpose: MYOC (myocilin) mutations account for 3% to 5% of primary open-angle glaucoma (POAG) cases. We aimed to understand the true population-wide penetrance and characteristics of glaucoma among individuals with the most common MYOC variant (p.Gln368Ter) and the impact of a POAG polygenic risk score (PRS) in this population. Design: Cross-sectional population-based study. Participants: Individuals with the p.Gln368Ter variant among 77 959 UK Biobank participants with fundus photographs (FPs). Methods: A genome-wide POAG PRS was computed, and 2 masked graders reviewed FPs for disc-defined glaucoma (DDG). Main outcome measures: Penetrance of glaucoma. Results: Two hundred individuals carried the p.Gln368Ter heterozygous genotype, and 177 had gradable FPs. One hundred thirty-two showed no evidence of glaucoma, 45 (25.4%) had probable/definite glaucoma in at least 1 eye, and 19 (10.7%) had bilateral glaucoma. No differences were found in age, race/ethnicity, or gender among groups (P > 0.05). Of those with DDG, 31% self-reported or had International Classification of Diseases codes for glaucoma, whereas 69% were undiagnosed. Those with DDG had higher medication-adjusted cornea-corrected intraocular pressure (IOPcc) (P < 0.001) vs. those without glaucoma. This difference in IOPcc was larger in those with DDG with a prior glaucoma diagnosis versus those not diagnosed (P < 0.001). Most p.Gln368Ter carriers showed IOP in the normal range (≤21 mmHg), although this proportion was lower in those with DDG (P < 0.02) and those with prior glaucoma diagnosis (P < 0.03). Prevalence of DDG increased with each decile of POAG PRS. Individuals with DDG demonstrated significantly higher PRS compared with those without glaucoma (0.37 ± 0.97 vs. 
0.01 ± 0.90; P = 0.03). Of those with DDG, individuals with a prior diagnosis of glaucoma had higher PRS compared with undiagnosed individuals (1.31 ± 0.64 vs. 0.00 ± 0.81; P < 0.001) and 27.5 times (95% confidence interval, 2.5-306.6) adjusted odds of being in the top decile of PRS for POAG. Conclusions: One in 4 individuals with the MYOC p.Gln368Ter mutation demonstrated evidence of glaucoma, a substantially higher penetrance than previously estimated, with 69% of cases undetected. A large portion of p.Gln368Ter carriers, including those with DDG, have IOP in the normal range, despite similar age. Polygenic risk score increases disease penetrance and severity, supporting the usefulness of PRS in risk stratification among MYOC p.Gln368Ter carriers.
2021-09-30Acceleration of age-induced proteolysis in the guinea pig lens nucleus by in vivo exposure to hyperbaric oxygen: A mass spectrometry analysisGiblin FJ, Anderson DMG, Han J, Rose KL, Wang Z, Schey KLTMC-Vanderbilt (Eye/pancreas)Hyperbaric oxygen (HBO) treatment of animals or ocular lenses in culture recapitulates many molecular changes observed in human age-related nuclear cataract. The guinea pig HBO model has been one of the best examples of such treatment leading to dose-dependent development of lens nuclear opacities. In this study, complementary mass spectrometry methods were employed to examine protein truncation after HBO treatment of aged guinea pigs. Quantitative liquid chromatography-mass spectrometry (LC-MS) analysis of the membrane fraction of guinea pig lenses showed statistically significant increases in aquaporin-0 (AQP0) C-terminal truncation, consistent with previous reports of accelerated loss of membrane and cytoskeletal proteins. In addition, imaging mass spectrometry (IMS) analysis spatially mapped the acceleration of age-related αA-crystallin truncation in the lens nucleus. The truncation sites in αA-crystallin closely match those observed in human lenses with age. Taken together, our results suggest that HBO accelerates the normal lens aging process and leads to nuclear cataract.
2021-10-04Computational tools for analyzing single-cell data in pluripotent cell differentiation studiesDing J, Alavi A, Ebrahimkhani MR, Bar-Joseph ZHIVE TC-CMUSingle-cell technologies are revolutionizing the ability of researchers to infer the causes and results of biological processes. Although several studies of pluripotent cell differentiation have recently utilized single-cell sequencing data, other aspects related to the optimization of differentiation protocols, their validation, robustness, and usage are still not taking full advantage of single-cell technologies. In this review, we focus on computational approaches for the analysis of single-cell omics and imaging data and discuss their use to address many of the major challenges involved in the development, validation, and use of cells obtained from pluripotent cell differentiation.
2021-10-14Editorial: Footprints of Immune Cells in the Type 1 Diabetic PancreasBrusko TM, Mallone R, Rodriguez-Calvo TTMC-FloridaNA
2021-10-273D virtual reality vs. 2D desktop registration user interface comparisonBueckle A, Buehling K, Shih PC, Börner KHIVE MC-IUWorking with organs and extracted tissue blocks is an essential task in many medical surgery and anatomy environments. In order to prepare specimens from human donors for further analysis, wet-bench workers must properly dissect human tissue and collect metadata for downstream analysis, including information about the spatial origin of tissue. The Registration User Interface (RUI) was developed to allow stakeholders in the Human Biomolecular Atlas Program (HuBMAP) to register tissue blocks, i.e., to record the size, position, and orientation of human tissue data with regard to reference organs. The RUI has been used by tissue mapping centers across the HuBMAP consortium to register a total of 45 kidney, spleen, and colon tissue blocks, with planned support for 17 organs in the near future. In this paper, we compare three setups for registering one 3D tissue block object to another 3D reference organ (target) object. The first setup is a 2D Desktop implementation featuring a traditional screen, mouse, and keyboard interface. The remaining setups are both virtual reality (VR) versions of the RUI: VR Tabletop, where users sit at a physical desk which is replicated in virtual space; VR Standup, where users stand upright while performing their tasks. All three setups were implemented using the Unity game engine. We then ran a user study for these three setups involving 42 human subjects completing 14 increasingly difficult and then 30 identical tasks in sequence and reporting position accuracy, rotation accuracy, completion time, and satisfaction. All study materials were made available in support of future study replication, alongside videos documenting our setups. 
We found that while VR Tabletop and VR Standup users are about three times as fast and about a third more accurate in terms of rotation than 2D Desktop users (for the sequence of 30 identical tasks), there are no significant differences between the three setups for position accuracy when normalized by the height of the virtual kidney across setups. When extrapolating from the 2D Desktop setup with a 113-mm-tall kidney, the absolute performance values for the 2D Desktop version (22.6 seconds per task, 5.88 degrees rotation, and 1.32 mm position accuracy after 8.3 tasks in the series of 30 identical tasks) confirm that the 2D Desktop interface is well-suited for allowing users in HuBMAP to register tissue blocks at a speed and accuracy that meets the needs of experts performing tissue dissection. In addition, the 2D Desktop setup is cheaper, easier to learn, and more practical for wet-bench environments than the VR setups.
2021-10-28Deep learning and alignment of spatially resolved single-cell transcriptomes with TangramBiancalani T, Scalia G, Buffoni L, Avasthi R, Lu Z, Sanger A, Tokcan N, Vanderburg CR, Segerstolpe Å, Zhang M, Avraham-Davidi I, Vickovic S, Nitzan M, Ma S, Subramanian A, Lipinski M, Buenrostro J, Brown NB, Fanelli D, Zhuang X, Macosko EZ, Regev AHIVE MC-NYGCCharting an organ's biological atlas requires us to spatially resolve the entire single-cell transcriptome, and to relate such cellular features to the anatomical scale. Single-cell and single-nucleus RNA-seq (sc/snRNA-seq) can profile cells comprehensively, but lose spatial information. Spatial transcriptomics allows for spatial measurements, but at lower resolution and with limited sensitivity. Targeted in situ technologies solve both issues, but are limited in gene throughput. To overcome these limitations we present Tangram, a method that aligns sc/snRNA-seq data to various forms of spatial data collected from the same region, including MERFISH, STARmap, smFISH, Spatial Transcriptomics (Visium) and histological images. Tangram can map any type of sc/snRNA-seq data, including multimodal data such as those from SHARE-seq, which we used to reveal spatial patterns of chromatin accessibility. We demonstrate Tangram on healthy mouse brain tissue, by reconstructing a genome-wide anatomically integrated spatial map at single-cell resolution of the visual and somatomotor areas.
2021-10-31Advances in spatial transcriptomic data analysisDries R, Chen J, Del Rossi N, Khan MM, Sistig A, Yuan GCTTD-Cal TechSpatial transcriptomics is a rapidly growing field that promises to comprehensively characterize tissue organization and architecture at the single-cell or subcellular resolution. Such information provides a solid foundation for mechanistic understanding of many biological processes in both health and disease that cannot be obtained by using traditional technologies. The development of computational methods plays important roles in extracting biological signals from raw data. Various approaches have been developed to overcome technology-specific limitations such as spatial resolution, gene coverage, sensitivity, and technical biases. Downstream analysis tools formulate spatial organization and cell–cell communications as quantifiable properties, and provide algorithms to derive such properties. Integrative pipelines further assemble multiple tools in one package, allowing biologists to conveniently analyze data from beginning to end. In this review, we summarize the state of the art of spatial transcriptomic data analysis methods and pipelines, and discuss how they operate on different technological platforms.
2021-11-01In-depth triacylglycerol profiling using MS3 Q-Trap mass spectrometryCabruja M, Priotti J, Domizi P, Papsdorf K, Kroetz DL, Brunet A, Contrepois K, Snyder MPTMC-StanfordTotal triacylglycerol (TAG) level is a key clinical marker of metabolic and cardiovascular diseases. However, the roles of individual TAGs have not been thoroughly explored in part due to their extreme structural complexity. We present a targeted mass spectrometry-based method combining multiple reaction monitoring (MRM) and multiple stage mass spectrometry (MS3) for the comprehensive qualitative and semiquantitative profiling of TAGs. This method, referred to as TriP-MS3 (triacylglycerol profiling using MS3), screens for more than 6,700 TAG species in a fully automated fashion. TriP-MS3 demonstrated excellent reproducibility (median interday CV ∼ 0.15) and linearity (median R² = 0.978) and detected 285 individual TAG species in human plasma. The semiquantitative accuracy of the method was validated by comparison with a state-of-the-art reverse phase liquid chromatography (RPLC)-MS (R² = 0.83), which is the most commonly used approach for TAG profiling. Finally, we demonstrate the utility and the versatility of the method by characterizing the effects of a fatty acid desaturase inhibitor on TAG profiles in vitro and by profiling TAGs in Caenorhabditis elegans.
2021-11-01Single-cell chromatin state analysis with SignacStuart T, Srivastava A, Madad S, Lareau CA, Satija RHIVE MC-NYGCThe recent development of experimental methods for measuring chromatin state at single-cell resolution has created a need for computational tools capable of analyzing these datasets. Here we developed Signac, a comprehensive toolkit for the analysis of single-cell chromatin data. Signac enables an end-to-end analysis of single-cell chromatin data, including peak calling, quantification, quality control, dimension reduction, clustering, integration with single-cell gene expression datasets, DNA motif analysis and interactive visualization. Through its seamless compatibility with the Seurat package, Signac facilitates the analysis of diverse multimodal single-cell chromatin data, including datasets that co-assay DNA accessibility with gene expression, protein abundance and mitochondrial genotype. We demonstrate scaling of the Signac framework to analyze datasets containing over 700,000 cells.
2021-11-01Haplotype-aware variant calling with PEPPER-Margin-DeepVariant enables high accuracy in nanopore long-readsShafin K, Pesout T, Chang PC, Nattestad M, Kolesnikov A, Goel S, Baid G, Kolmogorov M, Eizenga JM, Miga KH, Carnevali P, Jain M, Carroll A, Paten BHIVE TC-CMULong-read sequencing has the potential to transform variant detection by reaching currently difficult-to-map regions and routinely linking together adjacent variations to enable read-based phasing. Third-generation nanopore sequence data have demonstrated a long read length, but current interpretation methods for their novel pore-based signal have unique error profiles, making accurate analysis challenging. Here, we introduce a haplotype-aware variant calling pipeline, PEPPER-Margin-DeepVariant, that produces state-of-the-art variant calling results with nanopore data. We show that our nanopore-based method outperforms the short-read-based single-nucleotide-variant identification method at the whole-genome scale and produces high-quality single-nucleotide variants in segmental duplications and low-mappability regions where short-read-based genotyping fails. We show that our pipeline can provide highly contiguous phase blocks across the genome with nanopore reads, contiguously spanning between 85% and 92% of annotated genes across six samples. We also extend PEPPER-Margin-DeepVariant to PacBio HiFi data, providing an efficient solution with superior performance over the current WhatsHap-DeepVariant standard. Finally, we demonstrate de novo assembly polishing methods that use nanopore and PacBio HiFi reads to produce diploid assemblies with high accuracy (Q35+ nanopore-polished and Q40+ PacBio HiFi-polished).
2021-11-08Cell type ontologies of the Human Cell AtlasOsumi-Sutherland D, Xu C, Keays M, Levine AP, Kharchenko PV, Regev A, Lein E, Teichmann SAHIVE TC-CMUMassive single-cell profiling efforts have accelerated our discovery of the cellular composition of the human body while at the same time raising the need to formalize this new knowledge. Here, we discuss current efforts to harmonize and integrate different sources of annotations of cell types and states into a reference cell ontology. We illustrate with examples how a unified ontology can consolidate and advance our understanding of cell types across scientific communities and biological domains.
2021-11-08Anatomical structures, cell types and biomarkers of the Human Reference AtlasBörner K, Teichmann SA, Quardokus EM, Gee JC, Browne K, Osumi-Sutherland D, Herr BW 2nd, Bueckle A, Paul H, Haniffa M, Jardine L, Bernard A, Ding SL, Miller JA, Lin S, Halushka MK, Boppana A, Longacre TA, Hickey J, Lin Y, Valerius MT, He Y, Pryhuber G, Sun X, Jorgensen M, Radtke AJ, Wasserfall C, Ginty F, Ho J, Sunshine J, Beuschel RT, Brusko M, Lee S, Malhotra R, Jain S, Weber GHIVE MC-IUThe Human Reference Atlas (HRA) aims to map all of the cells of the human body to advance biomedical research and clinical practice. This Perspective presents collaborative work by members of 16 international consortia on two essential and interlinked parts of the HRA: (1) three-dimensional representations of anatomy that are linked to (2) tables that name and interlink major anatomical structures, cell types, plus biomarkers (ASCT+B). We discuss four examples that demonstrate the practical utility of the HRA.
2021-11-11Towards inferring nanopore sequencing ionic currents from nucleotide chemical structuresDing H, Anastopoulos I, Bailey AD 4th, Stuart J, Paten BHIVE TC-CMUThe characteristic ionic currents of nucleotide kmers are commonly used in analyzing nanopore sequencing readouts. We present a graph convolutional network-based deep learning framework for predicting kmer characteristic ionic currents from corresponding chemical structures. We show such a framework can generalize the chemical information of the 5-methyl group from thymine to cytosine by correctly predicting 5-methylcytosine-containing DNA 6mers, thus shedding light on the de novo detection of nucleotide modifications.
2021-11-18Immune cell topography predicts response to PD-1 blockade in cutaneous T cell lymphomaPhillips D, Matusiak M, Gutierrez BR, Bhate SS, Barlow GL, Jiang S, Demeter J, Smythe KS, Pierce RH, Fling SP, Ramchurren N, Cheever MA, Goltsev Y, West RB, Khodadoust MS, Kim YH, Schürch CM, Nolan GPTMC-StanfordCutaneous T cell lymphomas (CTCL) are rare but aggressive cancers without effective treatments. While a subset of patients derive benefit from PD-1 blockade, there is a critically unmet need for predictive biomarkers of response. Herein, we perform CODEX multiplexed tissue imaging and RNA sequencing on 70 tumor regions from 14 advanced CTCL patients enrolled in a pembrolizumab clinical trial (NCT02243579). We find no differences in the frequencies of immune or tumor cells between responders and non-responders. Instead, we identify topographical differences between effector PD-1+ CD4+ T cells, tumor cells, and immunosuppressive Tregs, from which we derive a spatial biomarker, termed the SpatialScore, that correlates strongly with pembrolizumab response in CTCL. The SpatialScore coincides with differences in the functional immune state of the tumor microenvironment, T cell function, and tumor cell-specific chemokine recruitment and is validated using a simplified, clinically accessible tissue imaging platform. Collectively, these results provide a paradigm for investigating the spatial balance of effector and suppressive T cell activity and broadly leveraging this biomarker approach to inform the clinical use of immunotherapies.
2021-11-23Scalable dual-omics profiling with single-nucleus chromatin accessibility and mRNA expression sequencing 2 (SNARE-seq2)Plongthongkum N, Diep D, Chen S, Lake BB, Zhang KTMC-UCSDComprehensive characterization of cellular heterogeneity and the underlying regulatory landscapes of tissues and organs requires a highly robust and scalable method to acquire matched RNA and chromatin accessibility profiles on the same cells. Here, we describe a single-nucleus chromatin accessibility and mRNA expression sequencing 2 (SNARE-seq2) assay, implemented with cellular combinatorial indexing. This method involves tagmentation within permeabilized and fixed single-nucleus isolates to capture accessible chromatin (AC) regions, followed by the capture and reverse transcription of RNA transcripts. Through combinatorial split pool ligations, cDNA and AC within each single nucleus become appended with a common cell barcode combination. The captured cDNA and AC are then co-amplified before splitting and enrichment into single-nucleus RNA and single-nucleus AC sequencing libraries. This protocol is compatible with both nuclei and whole cells and can be completed in 3.5 d. SNARE-seq2 permits robust generation of high-quality, joint single-cell RNA and AC sequencing libraries from hundreds of thousands of single cells per experiment.
2021-11-26Self-supervised clustering of mass spectrometry imaging data using contrastive learningHu H, Bindu JP, Laskin JTTD-PurdueMass spectrometry imaging (MSI) is widely used for the label-free molecular mapping of biological samples. The identification of co-localized molecules in MSI data is crucial to the understanding of biochemical pathways. One of the key challenges in molecular colocalization is that complex MSI data are too large for manual annotation but too small for training deep neural networks. Herein, we introduce a self-supervised clustering approach based on contrastive learning, which shows excellent performance in clustering of MSI data. We train a deep convolutional neural network (CNN) using MSI data from a single experiment without manual annotations to effectively learn high-level spatial features from ion images and classify them based on molecular colocalizations. We demonstrate that contrastive learning generates ion image representations that form well-resolved clusters. Subsequent self-labeling is used to fine-tune both the CNN encoder and linear classifier based on confidently classified ion images. This new approach enables autonomous and high-throughput identification of co-localized species in MSI data, which will dramatically expand the application of spatial lipidomics, metabolomics, and proteomics in biological research.
2021-12-01A Pilot Study of Urine Proteomics in Covid-19-associated Acute Kidney InjuryYe Y, Swensen AC, Wang Y, Kaushal M, Salamon D, Knoten A, Nicora CD, Marks L, Gaut JP, Vijayan A, Orton DJ, Mudd PA, Parikh CR, Qian WJ, O’Halloran JA, Piehowski PD, Jain STTD-PurdueAcute kidney injury (AKI) is a major complication associated with COVID-19 and occurs in up to 76% of intensive care unit patients [1, 2]. The mortality rate of COVID-19 patients who developed AKI (COVID-AKI) is more than 10 times higher than that of those who did not. While candidate AKI markers exist, the etiology of COVID-AKI is multifactorial, requiring agnostic approaches for identification of analytes early in hospital course to provide insights into biomarkers and mechanisms associated with COVID-AKI and COVID-19 infection. Research on COVID-19-associated effects on the urinary proteome is limited, and kidney dysfunction has not been reported [3]. Approximately 70% of the proteins detected in urine are produced in the kidney with a significant amount filtered from blood. We hypothesize that the changes in protein abundances in urine could lead to the discovery of protein markers associated with COVID-19 or COVID-AKI and provide mechanistic insights to improve understanding. Results and Discussion: We analyzed urine samples from 14 participants (6 COVID-AKI, 3 COVID-NoAKI and 5 NoCOVID-NoAKI) (Figure 1A, Table S1). To account for the large variation in protein content across urine specimens we utilized a bicinchoninic acid (BCA) assay to measure peptide concentration after protein digestion. All peptide samples were then normalized to the same concentration prior to analysis to allow for relative quantitation of differences in the urinary proteome. 
The urine proteome of COVID-AKI: After confirming the quality of urine for analyte discovery (Table S2, Figure S1), we examined whether underlying variance could distinguish AKI+ samples (6 COVID-AKI) from other samples without AKI (8 AKI-); all AKI samples were from COVID-19 patients. The first two principal components accounted for 42.1% of the variance, and clearly separated the two groups (Figure 1B); thereby
2021-12-06geneBasis: an iterative approach for unsupervised selection of targeted gene panels from scRNA-seqMissarova A, Jain J, Butler A, Ghazanfar S, Stuart T, Brusko M, Wasserfall C, Nick H, Brusko T, Atkinson M, Satija R, Marioni JCHIVE MC-NYGCscRNA-seq datasets are increasingly used to identify gene panels that can be probed using alternative technologies, such as spatial transcriptomics, where choosing the best subset of genes is vital. Existing methods are limited by a reliance on pre-existing cell type labels or by difficulties in identifying markers of rare cells. We introduce an iterative approach, geneBasis, for selecting an optimal gene panel, where each newly added gene captures the maximum distance between the true manifold and the manifold constructed using the currently selected gene panel. Our approach outperforms existing strategies and can resolve cell types and subtle cell state differences.
2021-12-14Cross-Laboratory Standardization of Preclinical Lipidomics Using Differential Mobility Spectrometry and Multiple Reaction MonitoringGhorasaini M, Mohammed Y, Adamski J, Bettcher L, Bowden JA, Cabruja M, Contrepois K, Ellenberger M, Gajera B, Haid M, Hornburg D, Hunter C, Jones CM, Klein T, Mayboroda O, Mirzaian M, Moaddel R, Ferrucci L, Lovett J, Nazir K, Pearson M, Ubhi BK, Raftery D, Riols F, Sayers R, Sijbrands EJG, Snyder MP, Su B, Velagapudi V, Williams KJ, de Rijke YB, Giera MTMC-StanfordModern biomarker and translational research as well as personalized health care studies rely heavily on powerful omics technologies, including metabolomics and lipidomics. However, to translate metabolomics and lipidomics discoveries into a high-throughput clinical setting, standardization is of utmost importance. Here, we compared and benchmarked a quantitative lipidomics platform. The employed Lipidyzer platform is based on lipid class separation by means of differential mobility spectrometry with subsequent multiple reaction monitoring. Quantitation is achieved by the use of 54 deuterated internal standards and an automated informatics approach. We investigated the platform performance across nine laboratories using NIST SRM 1950-Metabolites in Frozen Human Plasma, and three NIST Candidate Reference Materials 8231-Frozen Human Plasma Suite for Metabolomics (high triglyceride, diabetic, and African-American plasma). In addition, we comparatively analyzed 59 plasma samples from individuals with familial hypercholesterolemia from a clinical cohort study. We provide evidence that the more practical methyl-tert-butyl ether extraction outperforms the classic Bligh and Dyer approach and compare our results with two previously published ring trials. 
In summary, we present standardized lipidomics protocols, allowing for the highly reproducible analysis of several hundred human plasma lipids, and present detailed molecular information for potentially disease relevant and ethnicity-related materials.
2021-12-17Pangenomics enables genotyping of known structural variants in 5202 diverse genomesSirén J, Monlong J, Chang X, Novak AM, Eizenga JM, Markello C, Sibbesen JA, Hickey G, Chang PC, Carroll A, Gupta N, Gabriel S, Blackwell TW, Ratan A, Taylor KD, Rich SS, Rotter JI, Haussler D, Garrison E, Paten BHIVE TC-CMUWe introduce Giraffe, a pangenome short-read mapper that can efficiently map to a collection of haplotypes threaded through a sequence graph. Giraffe maps sequencing reads to thousands of human genomes at a speed comparable to that of standard methods mapping to a single reference genome. The increased mapping accuracy enables downstream improvements in genome-wide genotyping pipelines for both small variants and larger structural variants. We used Giraffe to genotype 167,000 structural variants, discovered in long-read studies, in 5202 diverse human genomes that were sequenced using short reads. We conclude that pangenomics facilitates a more comprehensive characterization of variation and, as a result, has the potential to improve many genomic analyses.
2021-12-22Tissue fixation effects on human retinal lipid analysis by MALDI imaging and LC-MS/MS technologiesKotnala A, Anderson DMG, Patterson NH, Cantrell LS, Messinger JD, Curcio CA, Schey KLTMC-Vanderbilt (Eye/pancreas)Imaging mass spectrometry (IMS) allows the location and abundance of lipids to be mapped across tissue sections of human retina. For reproducible and accurate information, sample preparation methods need to be optimized. Paraformaldehyde fixation of a delicate multilayer structure like human retina facilitates the preservation of tissue morphology by forming methylene bridge crosslinks between formaldehyde and amine/thiols in biomolecules; however, retina sections analyzed by IMS are typically fresh-frozen. To determine if clinically significant inferences could be reliably based on fixed tissue, we evaluated the effect of fixation on analyte detection, spatial localization, and introduction of artifactual signals. Hence, we assessed the molecular identity of lipids generated by matrix-assisted laser desorption ionization (MALDI-IMS) and liquid chromatography coupled tandem mass spectrometry (LC-MS/MS) for fixed and fresh-frozen retina tissues in positive and negative ion modes. Based on MALDI-IMS analysis, more lipid signals were observed in fixed compared with fresh-frozen retina. More potassium adducts were observed in fresh-frozen tissues than fixed as the fixation process caused displacement of potassium adducts to protonated and sodiated species in positive ion mode. LC-MS/MS analysis revealed an overall decrease in lipid signals due to fixation that reduced glycerophospholipids and glycerolipids and conserved most sphingolipids and cholesteryl esters. The high quality and reproducible information from untargeted lipidomics analysis of fixed retina informs on all major lipid classes, similar to fresh-frozen retina, and serves as a steppingstone towards understanding of lipid alterations in retinal diseases.
2022-01-01Magnetic resonance linear accelerator technology and adaptive radiation therapy: An overview for cliniciansHall WA, Paulson E, Li XA, Erickson B, Schultz C, Tree A, Awan M, Low DA, McDonald BA, Salzillo T, Glide-Hurst CK, Kishan AU, Fuller CDHIVE IEC-PSCRadiation therapy (RT) continues to play an important role in the treatment of cancer. Adaptive RT (ART) is a novel method through which RT treatments are evolving. With the ART approach, computed tomography or magnetic resonance (MR) images are obtained as part of the treatment delivery process. This enables the adaptation of the irradiated volume to account for changes in organ and/or tumor position, movement, size, or shape that may occur over the course of treatment. The advantages and challenges of ART may be somewhat abstract to oncologists and clinicians outside of the specialty of radiation oncology. ART is positioned to affect many different types of cancer. There is a wide spectrum of hypothesized benefits, from small toxicity improvements to meaningful gains in overall survival. The use and application of this novel technology should be understood by the oncologic community at large, such that it can be appropriately contextualized within the landscape of cancer therapies. Likewise, the need to test these advances is pressing. MR-guided ART (MRgART) is an emerging, extended modality of ART that expands upon and further advances the capabilities of ART. MRgART presents unique opportunities to iteratively improve adaptive image guidance. However, although the MRgART adaptive process advances ART to previously unattained levels, it can be more expensive, time-consuming, and complex. In this review, the authors present an overview for clinicians describing the process of ART and specifically MRgART.
2022-01-03Highly multiplexed immunofluorescence of the human kidney using co-detection by indexingNeumann EK, Patterson NH, Rivera ES, Allen JL, Brewer M, deCaestecker MP, Caprioli RM, Fogo AB, Spraggins JMTMC-Vanderbilt (Kidney)The human kidney is composed of many cell types that vary in their abundance and distribution from normal to diseased organ. As these cell types perform unique and essential functions, it is important to confidently label each within a single tissue to accurately assess tissue architecture and microenvironments. Towards this goal, we demonstrate the use of co-detection by indexing (CODEX) multiplexed immunofluorescence for visualizing 23 antigens within the human kidney. Using CODEX, many of the major cell types and substructures, such as collecting ducts, glomeruli, and thick ascending limb, were visualized within a single tissue section. Of these antibodies, 19 were conjugated in-house, demonstrating the flexibility and utility of this approach for studying the human kidney using custom and commercially available antibodies. We performed a pilot study that compared both fresh frozen and formalin-fixed paraffin-embedded healthy non-neoplastic and diabetic nephropathy kidney tissues. The largest cellular differences between the two groups were observed in cells labeled with aquaporin 1, cytokeratin 7, and α-smooth muscle actin. Thus, our data show the power of CODEX multiplexed immunofluorescence for surveying the cellular diversity of the human kidney and the potential for applications within pathology, histology, and building anatomical atlases.
2022-01-10Integration of spatial and single-cell transcriptomic data elucidates mouse organogenesisLohoff T, Ghazanfar S, Missarova A, Koulena N, Pierson N, Griffiths JA, Bardot ES, Eng CL, Tyser RCV, Argelaguet R, Guibentif C, Srinivas S, Briscoe J, Simons BD, Hadjantonakis AK, Göttgens B, Reik W, Nichols J, Cai L, Marioni JCHIVE MC-NYGCMolecular profiling of single cells has advanced our knowledge of the molecular basis of development. However, current approaches mostly rely on dissociating cells from tissues, thereby losing the crucial spatial context of regulatory processes. Here, we apply an image-based single-cell transcriptomics method, sequential fluorescence in situ hybridization (seqFISH), to detect mRNAs for 387 target genes in tissue sections of mouse embryos at the 8-12 somite stage. By integrating spatial context and multiplexed transcriptional measurements with two single-cell transcriptome atlases, we characterize cell types across the embryo and demonstrate that spatially resolved expression of genes not profiled by seqFISH can be imputed. We use this high-resolution spatial map to characterize fundamental steps in the patterning of the midbrain-hindbrain boundary (MHB) and the developing gut tube. We uncover axes of cell differentiation that are not apparent from single-cell RNA-sequencing (scRNA-seq) data, such as early dorsal-ventral separation of esophageal and tracheal progenitor populations in the gut tube. Our method provides an approach for studying cell fate decisions in complex tissues and development.
2022-01-10A census of the lung: CellCards from LungMAPSun X, Perl AK, Li R, Bell SM, Sajti E, Kalinichenko VV, Kalin TV, Misra RS, Deshmukh H, Clair G, Kyle J, Crotty Alexander LE, Masso-Silva JA, Kitzmiller JA, Wikenheiser-Brokamp KA, Deutsch G, Guo M, Du Y, Morley MP, Valdez MJ, Yu HV, Jin K, Bardes EE, Zepp JA, Neithamer T, Basil MC, Zacharias WJ, Verheyden J, Young R, Bandyopadhyay G, Lin S, Ansong C, Adkins J, Salomonis N, Aronow BJ, Xu Y, Pryhuber G, Whitsett J, Morrisey EE, NHLBI LungMAP ConsortiumTMC-UCSDThe human lung plays vital roles in respiration, host defense, and basic physiology. Recent technological advancements such as single-cell RNA sequencing and genetic lineage tracing have revealed novel cell types and enriched functional properties of existing cell types in lung. The time has come to take a new census. Initiated by members of the NHLBI-funded LungMAP Consortium and aided by experts in the lung biology community, we synthesized current data into a comprehensive and practical cellular census of the lung. Identities of cell types in the normal lung are captured in individual cell cards with delineation of function, markers, developmental lineages, heterogeneity, regenerative potential, disease links, and key experimental tools. This publication will serve as the starting point of a live, up-to-date guide for lung research at https://www.lungmap.net/cell-cards/. We hope that Lung CellCards will promote the community-wide effort to establish, maintain, and restore respiratory health.
2022-01-18Comparison and evaluation of statistical error models for scRNA-seqChoudhary S, Satija R.HIVE MC-NYGCBackground: Heterogeneity in single-cell RNA-seq (scRNA-seq) data is driven by multiple sources, including biological variation in cellular state as well as technical variation introduced during experimental processing. Deconvolving these effects is a key challenge for preprocessing workflows. Recent work has demonstrated the importance and utility of count models for scRNA-seq analysis, but there is a lack of consensus on which statistical distributions and parameter settings are appropriate. Results: Here, we analyze 59 scRNA-seq datasets that span a wide range of technologies, systems, and sequencing depths in order to evaluate the performance of different error models. We find that while a Poisson error model appears appropriate for sparse datasets, we observe clear evidence of overdispersion for genes with sufficient sequencing depth in all biological systems, necessitating the use of a negative binomial model. Moreover, we find that the degree of overdispersion varies widely across datasets, systems, and gene abundances, which argues for a data-driven approach to parameter estimation. Conclusions: Based on these analyses, we provide a set of recommendations for modeling variation in scRNA-seq data, particularly when using generalized linear models or likelihood-based approaches for preprocessing and downstream analysis.
2022-01-20The immunoregulatory landscape of human tuberculosis granulomasMcCaffrey EF, Donato M, Keren L, Chen Z, Delmastro A, Fitzpatrick MB, Gupta S, Greenwald NF, Baranski A, Graf W, Kumar R, Bosse M, Fullaway CC, Ramdial PK, Forgó E, Jojic V, Van Valen D, Mehra S, Khader SA, Bendall SC, van de Rijn M, Kalman D, Kaushal D, Hunter RL, Banaei N, Steyn AJC, Khatri P, Angelo MRTI-StanfordTuberculosis (TB) in humans is characterized by formation of immune-rich granulomas in infected tissues, the architecture and composition of which are thought to affect disease outcome. However, our understanding of the spatial relationships that control human granulomas is limited. Here, we used multiplexed ion beam imaging by time of flight (MIBI-TOF) to image 37 proteins in tissues from patients with active TB. We constructed a comprehensive atlas that maps 19 cell subsets across 8 spatial microenvironments. This atlas shows an IFN-γ-depleted microenvironment enriched for TGF-β, regulatory T cells and IDO1+ PD-L1+ myeloid cells. In a further transcriptomic meta-analysis of peripheral blood from patients with TB, immunoregulatory trends mirror those identified by granuloma imaging. Notably, PD-L1 expression is associated with progression to active TB and treatment response. These data indicate that in TB granulomas, there are local spatially coordinated immunoregulatory programs with systemic manifestations that define active TB.
2022-01-20Transition to invasive breast cancer is associated with progressive changes in the structure and composition of tumor stromaRisom T, Glass DR, Averbukh I, Liu CC, Baranski A, Kagel A, McCaffrey EF, Greenwald NF, Rivero-Gutiérrez B, Strand SH, Varma S, Kong A, Keren L, Srivastava S, Zhu C, Khair Z, Veis DJ, Deschryver K, Vennam S, Maley C, Hwang ES, Marks JR, Bendall SC, Colditz GA, West RB, Angelo MRTI-StanfordDuctal carcinoma in situ (DCIS) is a pre-invasive lesion that is thought to be a precursor to invasive breast cancer (IBC). To understand the changes in the tumor microenvironment (TME) accompanying transition to IBC, we used multiplexed ion beam imaging by time of flight (MIBI-TOF) and a 37-plex antibody staining panel to interrogate 79 clinically annotated surgical resections using machine learning tools for cell segmentation, pixel-based clustering, and object morphometrics. Comparison of normal breast with patient-matched DCIS and IBC revealed coordinated transitions between four TME states that were delineated based on the location and function of myoepithelium, fibroblasts, and immune cells. Surprisingly, myoepithelial disruption was more advanced in DCIS patients that did not develop IBC, suggesting this process could be protective against recurrence. Taken together, this HTAN Breast PreCancer Atlas study offers insight into drivers of IBC relapse and emphasizes the importance of the TME in regulating these processes.
2022-01-28The Blood Proteoform Atlas: A reference map of proteoforms in human hematopoietic cellsMelani RD, Gerbasi VR, Anderson LC, Sikora JW, Toby TK, Hutton JE, Butcher DS, Negrão F, Seckler HS, Srzentić K, Fornelli L, Camarillo JM, LeDuc RD, Cesnik AJ, Lundberg E, Greer JB, Fellers RT, Robey MT, DeHart CJ, Forte E, Hendrickson CL, Abbatiello SE, Thomas PM, Kokaji AI, Levitsky J, Kelleher NLRTI-NorthwesternHuman biology is tightly linked to proteins, yet most measurements do not precisely determine alternatively spliced sequences or posttranslational modifications. Here, we present the primary structures of ~30,000 unique proteoforms, nearly 10 times more than in previous studies, expressed from 1690 human genes across 21 cell types and plasma from human blood and bone marrow. The results, compiled in the Blood Proteoform Atlas (BPA), indicate that proteoforms better describe protein-level biology and are more specific indicators of differentiation than their corresponding proteins, which are more broadly expressed across cell types. We demonstrate the potential for clinical application, by interrogating the BPA in the context of liver transplantation and identifying cell and proteoform signatures that distinguish normal graft function from acute rejection and other causes of graft dysfunction.
2022-01-31Spatial genomics enables multi-modal study of clonal heterogeneity in tissuesZhao T, Chiang ZD, Morriss JW, LaFave LM, Murray EM, Del Priore I, Meli K, Lareau CA, Nadaf NM, Li J, Earl AS, Macosko EZ, Jacks T, Buenrostro JD, Chen FRTI-BroadThe state and behaviour of a cell can be influenced by both genetic and environmental factors. In particular, tumour progression is determined by underlying genetic aberrations as well as the makeup of the tumour microenvironment. Quantifying the contributions of these factors requires new technologies that can accurately measure the spatial location of genomic sequence together with phenotypic readouts. Here we developed slide-DNA-seq, a method for capturing spatially resolved DNA sequences from intact tissue sections. We demonstrate that this method accurately preserves local tumour architecture and enables the de novo discovery of distinct tumour clones and their copy number alterations. We then apply slide-DNA-seq to a mouse model of metastasis and a primary human cancer, revealing that clonal populations are confined to distinct spatial regions. Moreover, through integration with spatial transcriptomics, we uncover distinct sets of genes that are associated with clone-specific genetic aberrations, the local tumour microenvironment, or both. Together, this multi-modal spatial genomics approach provides a versatile platform for quantifying how cell-intrinsic and cell-extrinsic factors contribute to gene expression, protein abundance and other cellular phenotypes.
2022-02-01rPAC: Route based pathway analysis for cohorts of gene expression data setsJoshi P, Basso B, Wang H, Hong SH, Giardina C, Shin DGTMC-UConn/ScrippsPathway analysis is a popular method aiming to derive biological interpretation from high-throughput gene expression studies. However, existing methods focus mostly on identifying which pathway or pathways could have been perturbed, given differential gene expression patterns. In this paper, we present a novel pathway analysis framework, namely rPAC, which decomposes each signaling pathway route into two parts, the upstream portion of a transcription factor (TF) block and the downstream portion from the TF block, and generates a pathway route perturbation analysis scheme examining disturbance scores assigned to both parts together. This rPAC scoring is further applied to a cohort of gene expression data sets, which produces two summary metrics, "Proportion of Significance" (PS) and "Average Route Score" (ARS), as quantitative measures discerning perturbed pathway routes within and/or between cohorts. To demonstrate rPAC's scoring competency, we first used a large amount of simulated data and compared the method's performance against that of conventional methods in terms of power curves. Next, we performed a case study involving three epithelial cancer data sets from The Cancer Genome Atlas (TCGA). The rPAC method revealed specific pathway routes as potential cancer type signatures. A deeper pathway analysis of sub-groups (i.e., age groups in COAD or cancer sub-types in BRCA) resulted in pathway routes that are known to be associated with the sub-groups. In addition, multiple previously uncharacterized pathway routes were identified, potentially suggesting that rPAC is better at deciphering the etiology of a disease than conventional methods, particularly in isolating routes and sections of perturbed pathways at a finer granularity.
2022-02-11Spatial-CUT&Tag: Spatially resolved chromatin modification profiling at the cellular levelDeng Y, Bartosovic M, Kukanja P, Zhang D, Liu Y, Su G, Enninful A, Bai Z, Castelo-Branco G, Fan RTTD-YaleSpatial omics emerged as a new frontier of biological and biomedical research. Here, we present spatial-CUT&Tag for spatially resolved genome-wide profiling of histone modifications by combining in situ CUT&Tag chemistry, microfluidic deterministic barcoding, and next-generation sequencing. Spatially resolved chromatin states in mouse embryos revealed tissue-type-specific epigenetic regulations in concordance with ENCODE references and provide spatial information at tissue scale. Spatial-CUT&Tag revealed epigenetic control of the cortical layer development and spatial patterning of cell types determined by histone modification in mouse brain. Single-cell epigenomes can be derived in situ by identifying 20-micrometer pixels containing only one nucleus using immunofluorescence imaging. Spatial chromatin modification profiling in tissue may offer new opportunities to study epigenetic regulation, cell function, and fate decision in normal physiology and pathogenesis.
2022-02-11Uncovering Molecular Heterogeneity in the Kidney With Spatially Targeted Mass SpectrometryKruse ARS, Spraggins JMTMC-Vanderbilt (Eye/pancreas)The kidney functions through the coordination of approximately one million multifunctional nephrons in 3-dimensional space. Molecular understanding of the kidney has relied on transcriptomic, proteomic, and metabolomic analyses of kidney homogenate, but these approaches do not resolve cellular identity and spatial context. Mass spectrometry analysis of isolated cells retains cellular identity but not information regarding its cellular neighborhood and extracellular matrix. Spatially targeted mass spectrometry is uniquely suited to molecularly characterize kidney tissue while retaining in situ cellular context. This review summarizes advances in methodology and technology for spatially targeted mass spectrometry analysis of kidney tissue. Profiling technologies such as laser capture microdissection (LCM) coupled to liquid chromatography tandem mass spectrometry provide deep molecular coverage of specific tissue regions, while imaging technologies such as matrix assisted laser desorption/ionization imaging mass spectrometry (MALDI IMS) molecularly profile regularly spaced tissue regions with greater spatial resolution. These technologies individually have furthered our understanding of heterogeneity in nephron regions such as glomeruli and proximal tubules, and their combination is expected to profoundly expand our knowledge of the kidney in health and disease.
2022-03-01Spatial mapping of protein composition and tissue organization: a primer for multiplexed antibody-based imagingHickey JW, Neumann EK, Radtke AJ, Camarillo JM, Beuschel RT, Albanese A, McDonough E, Hatler J, Wiblin AE, Fisher J, Croteau J, Small EC, Sood A, Caprioli RM, Angelo RM, Nolan GP, Chung K, Hewitt SM, Germain RN, Spraggins JM, Lundberg E, Snyder MP, Kelleher NL, Saka SKTMC-StanfordTissues and organs are composed of distinct cell types that must operate in concert to perform physiological functions. Efforts to create high-dimensional biomarker catalogs of these cells have been largely based on single-cell sequencing approaches, which lack the spatial context required to understand critical cellular communication and correlated structural organization. To probe in situ biology with sufficient depth, several multiplexed protein imaging methods have been recently developed. Though these technologies differ in strategy and mode of immunolabeling and detection tags, they commonly utilize antibodies directed against protein biomarkers to provide detailed spatial and functional maps of complex tissues. As these promising antibody-based multiplexing approaches become more widely adopted, new frameworks and considerations are critical for training future users, generating molecular tools, validating antibody panels, and harmonizing datasets. In this Perspective, we provide essential resources, key considerations for obtaining robust and reproducible imaging data, and specialized knowledge from domain experts and technology developers.
2022-03-07MITI minimum information guidelines for highly multiplexed tissue imagesSchapiro D, Yapp C, Sokolov A, Reynolds SM, Chen YA, Sudar D, Xie Y, Muhlich J, Arias-Camison R, Arena S, Taylor AJ, Nikolov M, Tyler M, Lin JR, Burlingame EA; Human Tumor Atlas Network, Chang YH, Farhi SL, Thorsson V, Venkatamohan N, Drewes JL, Pe'er D, Gutman DA, Herrmann MD, Gehlenborg N, Bankhead P, Roland JT, Herndon JM, Snyder MP, Angelo M, Nolan G, Swedlow JR, Schultz N, Merrick DT, Mazzili SA, Cerami E, Rodig SJ, Santagata S, Sorger PKHIVE TC-HarvardN/A
2022-03-07TraSig: inferring cell-cell interactions from pseudotime ordering of scRNA-Seq dataLi D, Velazquez JJ, Ding J, Hislop J, Ebrahimkhani MR, Bar-Joseph ZHIVE TC-CMUA major advantage of single cell RNA-sequencing (scRNA-Seq) data is the ability to reconstruct continuous ordering and trajectories for cells. Here we present TraSig, a computational method for improving the inference of cell-cell interactions in scRNA-Seq studies that utilizes the dynamic information to identify significant ligand-receptor pairs with similar trajectories, which in turn are used to score interacting cell clusters. We applied TraSig to several scRNA-Seq datasets and obtained unique predictions that improve upon those identified by prior methods. Functional experiments validate the ability of TraSig to identify novel signaling interactions that impact vascular development in liver organoids. Software: https://github.com/doraadong/TraSig
2022-03-09A single-cell regulatory map of postnatal lung alveologenesis in humans and miceDuong TE, Wu Y, Sos BC, Dong W, Limaye S, Rivier LH, Myers G, Hagood JS, Zhang K.TMC-UCSDEx-utero regulation of the lungs' responses to breathing air and continued alveolar development shape adult respiratory health. Applying single-cell transposome hypersensitive site sequencing (scTHS-seq) to over 80,000 cells, we assembled the first regulatory atlas of postnatal human and mouse lung alveolar development. We defined regulatory modules and elucidated new mechanistic insights directing alveolar septation, including alveolar type 1 and myofibroblast cell signaling and differentiation, and a unique human matrix fibroblast population. Incorporating GWAS, we mapped lung function causal variants to myofibroblasts and identified a pathogenic regulatory unit linked to lineage marker FGF18, demonstrating the utility of chromatin accessibility data to uncover disease mechanism targets. Our regulatory map and analysis model provide valuable new resources to investigate age-dependent and species-specific control of critical developmental processes. Furthermore, these resources complement existing atlas efforts to advance our understanding of lung health and disease across the human lifespan.
2022-03-15Multicellular modules as clinical diagnostic and therapeutic targetsBaertsch MA, Nolan GP, Hickey JWTMC-StanfordThe complex determinants of health and disease can be determined when approached as a system of interactions of biological agents at different scales. Similar to the physicochemical properties that govern nucleic acids and proteins, there should be a finite set of rules that dictate the behavior of cells to form tissues. Thus, the occurrence of disease can be seen as flaws in processes that are governed by rules pertaining to multicellular structures. Multiplexed imaging is a technology that connects information that bridges multiple biological scales (i.e., molecules, cells, and tissues) and enables elucidation of rules associated with the formation of multicellular structures. Uncovering important multicellular structures associated with disease will propel a wave of development of new categories of diagnostics and therapeutics.
2022-03-15Limited extent and consequences of pancreatic SARS-CoV-2 infectionvan der Heide V, Jangra S, Cohen P, Rathnasinghe R, Aslam S, Aydillo T, Geanon D, Handler D, Kelley G, Lee B, Rahman A, Dawson T, Qi J, D'Souza D, Kim-Schulze S, Panzer JK, Caicedo A, Kusmartseva I, Posgai AL, Atkinson MA, Albrecht RA, García-Sastre A, Rosenberg BR, Schotsaert M, Homann DTMC-FloridaConcerns that infection with severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2), the etiological agent of coronavirus disease 2019 (COVID-19), may cause new-onset diabetes persist in an evolving research landscape, and precise risk assessment is hampered by, at times, conflicting evidence. Here, leveraging comprehensive single-cell analyses of in vitro SARS-CoV-2-infected human pancreatic islets, we demonstrate that productive infection is strictly dependent on the SARS-CoV-2 entry receptor ACE2 and targets practically all pancreatic cell types. Importantly, the infection remains highly circumscribed and largely non-cytopathic and, despite a high viral burden in infected subsets, promotes only modest cellular perturbations and inflammatory responses. Similar experimental outcomes are also observed after islet infection with endemic coronaviruses. Thus, the limits of pancreatic SARS-CoV-2 infection, even under in vitro conditions of enhanced virus exposure, challenge the proposition that in vivo targeting of β cells by SARS-CoV-2 precipitates new-onset diabetes. Whether restricted pancreatic damage and immunological alterations accrued by COVID-19 increase cumulative diabetes risk, however, remains to be evaluated.
2022-03-18Machine Learning Classifies Ferroptosis and Apoptosis Cell Death Modalities with TfR1 ImmunostainingJin J, Schorpp K, Samaga D, Unger K, Hadian K, Stockwell BRTTD-Columbia/Penn StateDetermining cell death mechanisms occurring in patient and animal tissues is a longstanding goal that requires suitable biomarkers and accurate quantification. However, effective methods remain elusive. To develop more powerful and unbiased analytic frameworks, we developed a machine learning approach for automated cell death classification. Image sets were collected of HT-1080 fibrosarcoma cells undergoing ferroptosis or apoptosis and stained with an anti-transferrin receptor 1 (TfR1) antibody, together with nuclear and F-actin staining. Features were extracted using high-content-analysis software, and a classifier was constructed by fitting a multinomial logistic lasso regression model to the data. The prediction accuracy of the classifier within three classes (control, ferroptosis, apoptosis) was 93%. Thus, TfR1 staining, combined with nuclear and F-actin staining, can reliably detect both apoptotic and ferroptotic cells when cell features are analyzed in an unbiased manner using machine learning, providing a method for unbiased analysis of modes of cell death.
2022-03-28Integrating transcription-factor abundance with chromatin accessibility in human erythroid lineage commitmentBaskar R, Chen AF, Favaro P, Reynolds W, Mueller F, Borges L, Jiang S, Park HS, Kool ET, Greenleaf WJ, Bendall SCRTI-StanfordMaster transcription factors (TFs) directly regulate present and future cell states by binding DNA regulatory elements and driving gene-expression programs. Their abundance influences epigenetic priming to different cell fates at the chromatin level, especially in the context of differentiation. In order to link TF protein abundance to changes in TF motif accessibility and open chromatin, we developed InTAC-seq, a method for simultaneous quantification of genome-wide chromatin accessibility and intracellular protein abundance in fixed cells. Our method produces high-quality data and is a cost-effective alternative to single-cell techniques. We showcase our method by purifying bone marrow (BM) progenitor cells based on GATA-1 protein levels and establish high GATA-1-expressing BM cells as both epigenetically and functionally similar to erythroid-committed progenitors.
2022-03-30Spatial-CITE-seq: spatially resolved high-plex protein and whole transcriptome co-mappingFan R, Liu Y, DiStasio M, Su G, Asashima H, Enninful A, Qin X, Deng Y, Bordignon P, Cassano M, Tomayko M, Xu M, Halene S, Craft J, Hafler DTTD-YaleWe present spatial-CITE-seq for high-plex protein and whole transcriptome co-mapping, which was first demonstrated by profiling 198 proteins and the transcriptome in multiple mouse tissue types. It was then applied to human tissues to measure 283 proteins and the transcriptome, which revealed a spatially distinct germinal center reaction in tonsil and early immune activation in skin at the COVID-19 mRNA vaccine injection site. Spatial-CITE-seq may find a range of applications in biomedical research.
2022-04-01Stratification of chemotherapy-treated stage III colorectal cancer patients using multiplexed imaging and single-cell analysis of T-cell populationsStachtea X, Loughrey MB, Salvucci M, Lindner AU, Cho S, McDonough E, Sood A, Graf J, Santamaria-Pang A, Corwin A, Laurent-Puig P, Dasgupta S, Shia J, Owens JR, Abate S, Van Schaeybroeck S, Lawler M, Prehn JHM, Ginty F, Longley DBRTI-GEscdColorectal cancer (CRC) has one of the highest cancer incidences and mortality rates. In stage III, postoperative chemotherapy benefits <20% of patients, while more than 50% will develop distant metastases. Biomarkers for identification of patients at increased risk of disease recurrence following adjuvant chemotherapy are currently lacking. In this study, we assessed immune signatures in the tumor and tumor microenvironment (TME) using an in situ multiplexed immunofluorescence imaging and single-cell analysis technology (Cell DIVE™) and evaluated their correlations with patient outcomes. Tissue microarrays (TMAs) with up to three 1 mm diameter cores per patient were prepared from 117 stage III CRC patients treated with adjuvant fluoropyrimidine/oxaliplatin (FOLFOX) chemotherapy. Single sections underwent multiplexed immunofluorescence staining for immune cell markers (CD45, CD3, CD4, CD8, FOXP3, PD1) and tumor/cell segmentation markers (DAPI, pan-cytokeratin, AE1, NaKATPase, and S6). We used annotations and a probabilistic classification algorithm to build statistical models of immune cell types. Images were also qualitatively assessed independently by a pathologist as 'high', 'moderate' or 'low', for stromal and total immune cell content. Excellent agreement was found between manual assessment and total automated scores (p < 0.0001).
Moreover, compared to single markers, a multi-marker classification of regulatory T cells (Tregs: CD3+/CD4+/FOXP3+/PD1-) was significantly associated with disease-free survival (DFS) and overall survival (OS) (p = 0.049 and 0.032) of FOLFOX-treated patients. Our results also showed that PD1- Tregs rather than PD1+ Tregs were associated with improved survival. These findings were supported by results from an independent FOLFOX-treated cohort of 191 stage III CRC patients, where higher PD1- Tregs (CD3+/CD4+/FOXP3+/PD1-) were associated with increased overall survival (p = 0.015). Overall, compared to single markers, multi-marker classification provided more accurate quantitation of immune cell types with stronger correlations with outcomes.
2022-04-01Proteomics Standards Initiative's ProForma 2.0: Unifying the Encoding of Proteoforms and PeptidoformsLeDuc RD, Deutsch EW, Binz PA, Fellers RT, Cesnik AJ, Klein JA, Van Den Bossche T, Gabriels R, Yalavarthi A, Perez-Riverol Y, Carver J, Bittremieux W, Kawano S, Pullman B, Bandeira N, Kelleher NL, Thomas PM, Vizcaíno JARTI-NorthwesternIt is important for the proteomics community to have a standardized manner to represent all possible variations of a protein or peptide primary sequence, including natural, chemically induced, and artifactual modifications. The Human Proteome Organization Proteomics Standards Initiative in collaboration with several members of the Consortium for Top-Down Proteomics (CTDP) has developed a standard notation called ProForma 2.0, which is a substantial extension of the original ProForma notation developed by the CTDP. ProForma 2.0 aims to unify the representation of proteoforms and peptidoforms. ProForma 2.0 supports use cases needed for bottom-up and middle-/top-down proteomics approaches and allows the encoding of highly modified proteins and peptides using a human- and machine-readable string. ProForma 2.0 can be used to represent protein modifications in a specified or ambiguous location, designated by mass shifts, chemical formulas, or controlled vocabulary terms, including cross-links (natural and chemical) and atomic isotopes. Notational conventions are based on public controlled vocabularies and ontologies. The most up-to-date full specification document and information about software implementations are available at http://psidev.info/proforma.
2022-04-01Putting Humpty Dumpty Back Together Again: What Does Protein Quantification Mean in Bottom-Up Proteomics?Plubell DL, Käll L, Webb-Robertson BJ, Bramer LM, Ives A, Kelleher NL, Smith LM, Montine TJ, Wu CC, MacCoss MJRTI-NorthwesternBottom-up proteomics provides peptide measurements and has been invaluable for moving proteomics into large-scale analyses. Commonly, a single quantitative value is reported for each protein-coding gene by aggregating peptide quantities into protein groups following protein inference or parsimony. However, given the complexity of both RNA splicing and post-translational protein modification, it is overly simplistic to assume that all peptides that map to a singular protein-coding gene will demonstrate the same quantitative response. By assuming that all peptides from a protein-coding sequence are representative of the same protein, we may miss the discovery of important biological differences. To capture the contributions of existing proteoforms, we need to reconsider the practice of aggregating protein values to a single quantity per protein-coding gene.
2022-04-01A complete reference genome improves analysis of human genetic variationAganezov S, Yan SM, Soto DC, Kirsche M, Zarate S, Avdeyev P, Taylor DJ, Shafin K, Shumate A, Xiao C, Wagner J, McDaniel J, Olson ND, Sauria MEG, Vollger MR, Rhie A, Meredith M, Martin S, Lee J, Koren S, Rosenfeld JA, Paten B, Layer R, Chin CS, Sedlazeck FJ, Hansen NF, Miller DE, Phillippy AM, Miga KH, McCoy RC, Dennis MY, Zook JM, Schatz MC.HIVE TC-CMUCompared to its predecessors, the Telomere-to-Telomere CHM13 genome adds nearly 200 million base pairs of sequence, corrects thousands of structural errors, and unlocks the most complex regions of the human genome for clinical and functional study. We show how this reference universally improves read mapping and variant calling for 3202 and 17 globally diverse samples sequenced with short and long reads, respectively. We identify hundreds of thousands of variants per sample in previously unresolved regions, showcasing the promise of the T2T-CHM13 reference for evolutionary and biomedical discovery. Simultaneously, this reference eliminates tens of thousands of spurious variants per sample, including reduction of false positives in 269 medically relevant genes by up to a factor of 12. Because of these improvements in variant discovery coupled with population and functional genomic resources, T2T-CHM13 is positioned to replace GRCh38 as the prevailing reference for human genetics.
2022-04-01Analyzing Spatial Transcriptomics Data Using GiottoDel Rossi N, Chen JG, Yuan GC, Dries R.TTD-Cal TechSpatial transcriptomic technologies have been developed rapidly in recent years. The addition of spatial context to expression data holds the potential to revolutionize many fields in biology. However, the lack of computational tools remains a bottleneck that is preventing the broader utilization of these technologies. Recently, we have developed Giotto as a comprehensive, generally applicable, and user-friendly toolbox for spatial transcriptomic data analysis and visualization. Giotto implements a rich set of algorithms to enable robust spatial data analysis. To help users get familiar with the Giotto environment and apply it effectively in analyzing new datasets, we will describe the detailed protocols for applying Giotto without any advanced programming skills. © 2022 Wiley Periodicals LLC. Basic Protocol 1: Getting Giotto set up for use Basic Protocol 2: Pre-processing Basic Protocol 3: Clustering and cell-type identification Basic Protocol 4: Cell-type enrichment and deconvolution analyses Basic Protocol 5: Spatial structure analysis tools Basic Protocol 6: Spatial domain detection by using a hidden Markov random field model Support Protocol 1: Spatial proximity-associated cell-cell interactions Support Protocol 2: Assembly of a registered 3D Giotto object from 2D slices.
2022-04-01Complete genomic and epigenetic maps of human centromeresAltemose N, Logsdon GA, Bzikadze AV, Sidhwani P, Langley SA, Caldas GV, Hoyt SJ, Uralsky L, Ryabov FD, Shew CJ, Sauria MEG, Borchers M, Gershman A, Mikheenko A, Shepelev VA, Dvorkina T, Kunyavskaya O, Vollger MR, Rhie A, McCartney AM, Asri M, Lorig-Roach R, Shafin K, Lucas JK, Aganezov S, Olson D, de Lima LG, Potapova T, Hartley GA, Haukness M, Kerpedjiev P, Gusev F, Tigyi K, Brooks S, Young A, Nurk S, Koren S, Salama SR, Paten B, Rogaev EI, Streets A, Karpen GH, Dernburg AF, Sullivan BA, Straight AF, Wheeler TJ, Gerton JL, Eichler EE, Phillippy AM, Timp W, Dennis MY, O'Neill RJ, Zook JM, Schatz MC, Pevzner PA, Diekhans M, Langley CH, Alexandrov IA, Miga KHHIVE TC-CMUExisting human genome assemblies have almost entirely excluded repetitive sequences within and near centromeres, limiting our understanding of their organization, evolution, and functions, which include facilitating proper chromosome segregation. Now, a complete, telomere-to-telomere human genome assembly (T2T-CHM13) has enabled us to comprehensively characterize pericentromeric and centromeric repeats, which constitute 6.2% of the genome (189.9 megabases). Detailed maps of these regions revealed multimegabase structural rearrangements, including in active centromeric repeat arrays. Analysis of centromere-associated sequences uncovered a strong relationship between the position of the centromere and the evolution of the surrounding DNA through layered repeat expansions. Furthermore, comparisons of chromosome X centromeres across a diverse panel of individuals illuminated high degrees of structural, epigenetic, and sequence variation in these complex and rapidly evolving regions.
2022-04-01Ductal Carcinoma In Situ of Breast: From Molecular Etiology to Therapeutic ManagementHophan SL, Odnokoz O, Liu H, Luo Y, Khan S, Gradishar W, Zhou Z, Badve S, Torres MA, Wan YTTD-PNNL/NorthwesternDuctal carcinoma in situ (DCIS) makes up a majority of noninvasive breast cancer cases. DCIS is a neoplastic proliferation of epithelial cells within the ductal structure of the breast. Currently, there is little known about the progression of DCIS to invasive ductal carcinoma (IDC), or the molecular etiology behind each DCIS lesion or grade. The DCIS lesions can be heterogeneous in morphology, genetics, cellular biology, and clinical behavior, posing challenges to our understanding of the molecular mechanisms by which approximately half of all DCIS lesions progress to an invasive status. New strategies that pinpoint molecular mechanisms are necessary to overcome this gap in understanding, which is a barrier to more targeted therapy. In this review, we will discuss the etiological factors associated with DCIS, as well as the complexity of each nuclear grade lesion. Moreover, we will discuss the possible molecular features that lead to progression of DCIS to IDC. We will highlight current therapeutic management and areas for improvement.
2022-04-04Supervised Deep Generation of High-Resolution Arterial Phase Computed Tomography Kidney Substructure AtlasLee HH, Tang Y, Bao S, Yang Q, Xu X, Fogo AB, Harris R, de Caestecker MP, Spraggins JM, Heinrich M, Huo Y, Landman BATMC-Vanderbilt (Kidney)The Human BioMolecular Atlas Program (HuBMAP) provides an opportunity to contextualize findings across cellular to organ systems levels. Constructing an atlas target is the primary endpoint for generalizing anatomical information across scales and populations. An initial target of HuBMAP is the kidney organ and arterial phase contrast-enhanced computed tomography (CT) provides distinctive appearance and anatomical context on the internal substructure of kidney organs such as renal cortex, medulla, and pelvicalyceal system. With the confounding effects of demographics and morphological characteristics of the kidney across large-scale imaging surveys, substantial variation is demonstrated with the internal substructure morphometry and the intensity contrast due to the variance of imaging protocols. Such variability increases the level of difficulty to localize the anatomical features of the kidney substructure in a well-defined spatial reference for clinical analysis. In order to stabilize the localization of kidney substructures in the context of this variability, we propose a high-resolution CT kidney substructure atlas template. Briefly, we introduce a deep learning preprocessing technique to extract the volumetric interest of the abdominal regions and further perform a deep supervised registration pipeline to stably adapt the anatomical context of the kidney internal substructure. To generate and evaluate the atlas template, arterial phase CT scans of 500 control subjects are de-identified and registered to the atlas template with a complete end-to-end pipeline. With stable registration to the abdominal wall and kidney organs, the internal substructure of both left and right kidneys is substantially localized in the high-resolution atlas space. The atlas average template successfully demonstrated the contextual details of the internal structure and was applicable to generalize the morphological variation of internal substructure across patients.
2022-04-06Concerted modification of nucleotides at functional centers of the ribosome revealed by single-molecule RNA modification profilingBailey AD 4th, Talkish J, Ding H, Igel HA, Duran A, Mantripragada S, Paten B, Ares M JrHIVE TC-CMUNucleotides in RNA and DNA are chemically modified by numerous enzymes that alter their function. Eukaryotic ribosomal RNA (rRNA) is modified at more than 100 locations, particularly at highly conserved and functionally important nucleotides. During ribosome biogenesis, modifications are added at various stages of assembly. The existence of differently modified classes of ribosomes in normal cells is unknown because no method exists to simultaneously evaluate the modification status at all sites within a single rRNA molecule. Using a combination of yeast genetics and nanopore direct RNA sequencing, we developed a reliable method to track the modification status of single rRNA molecules at 37 sites in 18S rRNA and 73 sites in 25S rRNA. We use our method to characterize patterns of modification heterogeneity and identify concerted modification of nucleotides found near functional centers of the ribosome. Distinct, undermodified subpopulations of rRNAs accumulate upon loss of Dbp3 or Prp43 RNA helicases, suggesting overlapping roles in ribosome biogenesis. Modification profiles are surprisingly resistant to change in response to many genetic and acute environmental conditions that affect translation, ribosome biogenesis, and pre-mRNA splicing. The ability to capture single molecule RNA modification profiles provides new insights into the roles of nucleotide modifications in RNA function.
2022-04-12Mapping the Proteoform Landscape of Five Human TissuesDrown BS, Jooß K, Melani RD, Lloyd-Jones C, Camarillo JM, Kelleher NLRTI-NorthwesternA functional understanding of the human body requires structure-function studies of proteins at scale. The chemical structure of proteins is controlled at the transcriptional, translational, and post-translational levels, creating a variety of products with modulated functions within the cell. The term "proteoform" encapsulates this complexity at the level of chemical composition. Comprehensive mapping of the proteoform landscape in human tissues necessitates analytical techniques with increased sensitivity and depth of coverage. Here, we took a top-down proteomics approach, combining data generated using capillary zone electrophoresis (CZE) and nanoflow reversed-phase liquid chromatography (RPLC) hyphenated to mass spectrometry to identify and characterize proteoforms from the human lungs, heart, spleen, small intestine, and kidneys. CZE and RPLC provided complementary post-translational modification and proteoform selectivity, thereby enhancing the overall proteome coverage when used in combination. Of the 11,466 proteoforms identified in this study, 7373 (64%) were not reported previously. Large differences in the protein and proteoform level were readily quantified, with initial inferences about proteoform biology operative in the analyzed organs. Differential proteoform regulation of defensins, glutathione transferases, and sarcomeric proteins across tissues generate hypotheses about how they function and are regulated in human health and disease.
2022-04-12Referenced Kendrick Mass Defect Annotation and Class-Based Filtering of Imaging MS Lipidomics ExperimentsRichardson LT, Neumann EK, Caprioli RM, Spraggins JM, Solouki TTMC-Vanderbilt (Kidney)Because of their diverse functionalities in cells, lipids are of primary importance when characterizing molecular profiles of physiological and disease states. Imaging mass spectrometry (IMS) provides the spatial distributions of lipid populations in tissues. Referenced Kendrick mass defect (RKMD) analysis is an effective mass spectrometry (MS) data analysis tool for classification and annotation of lipids. Herein, we extend the capabilities of RKMD analysis and demonstrate an integrated method for lipid annotation and chemical structure-based filtering for IMS datasets. Annotation of lipid features with lipid molecular class, radyl carbon chain length, and degree of unsaturation allows image reconstruction and visualization based on each structural characteristic. We show a proof-of-concept application of the method to a computationally generated IMS dataset and validate that the RKMD method is highly specific for lipid components in the presence of confounding background ions. Moreover, we demonstrate an application of the RKMD-based annotation and filtering to matrix-assisted laser desorption/ionization (MALDI) IMS lipidomic data from human kidney tissue analysis.
2022-04-13Whole-cell segmentation of tissue images with human-level performance using large-scale data annotation and deep learningGreenwald NF, Miller G, Moen E, Kong A, Kagel A, Dougherty T, Fullaway CC, McIntosh BJ, Leow KX, Schwartz MS, Pavelchek C, Cui S, Camplisson I, Bar-Tal O, Singh J, Fong M, Chaudhry G, Abraham Z, Moseley J, Warshawsky S, Soon E, Greenbaum S, Risom T, Hollmann T, Bendall SC, Keren L, Graf W, Angelo M, Van Valen DRTI-StanfordA principal challenge in the analysis of tissue imaging data is cell segmentation, the task of identifying the precise boundary of every cell in an image. To address this problem we constructed TissueNet, a dataset for training segmentation models that contains more than 1 million manually labeled cells, an order of magnitude more than all previously published segmentation training datasets. We used TissueNet to train Mesmer, a deep-learning-enabled segmentation algorithm. We demonstrated that Mesmer is more accurate than previous methods, generalizes to the full diversity of tissue types and imaging platforms in TissueNet, and achieves human-level performance. Mesmer enabled the automated extraction of key cellular features, such as subcellular localization of protein signal, which was challenging with previous approaches. We then adapted Mesmer to harness cell lineage information in highly multiplexed datasets and used this enhanced version to quantify cell morphology changes during human gestation. All code, data and models are released as a community resource.
2022-04-14Membrane marker selection for segmenting single cell spatial proteomics dataDayao MT, Brusko M, Wasserfall C, Bar-Joseph ZHIVE TC-CMUThe ability to profile spatial proteomics at the single cell level enables the study of cell types, their spatial distribution, and interactions in several tissues and conditions. Current methods for cell segmentation in such studies rely on known membrane or cell boundary markers. However, for many tissues, an optimal set of markers is not known, and even within a tissue, different cell types may express different markers. Here we present RAMCES, a method that uses a convolutional neural network to learn the optimal markers for a new sample and outputs a weighted combination of the selected markers for segmentation. Testing RAMCES on several existing datasets indicates that it correctly identifies cell boundary markers, improving on methods that rely on a single marker or those that extend nuclei segmentations. Application to new spatial proteomics data demonstrates its usefulness for accurately assigning cell types based on the proteins expressed in segmented cells.
2022-04-14Interactive single-cell data analysis using CellarHasanaj E, Wang J, Sarathi A, Ding J, Bar-Joseph ZHIVE TC-CMUCell type assignment is a major challenge for all types of high throughput single cell data. In many cases such assignment requires the repeated manual use of external and complementary data sources. To improve the ability to uniformly assign cell types across large consortia, platforms and modalities, we developed Cellar, a software tool that provides interactive support to all the different steps involved in the assignment and dataset comparison process. We discuss the different methods implemented by Cellar, how these can be used with different data types, how to combine complementary data types and how to analyze and visualize spatial data. We demonstrate the advantages of Cellar by using it to annotate several HuBMAP datasets from multi-omics single-cell sequencing and spatial proteomics studies. Cellar is open-source and includes several annotated HuBMAP datasets.
2022-05-03Crowdsourced RNA design discovers diverse, reversible, efficient, self-contained molecular switchesAndreasson JOL, Gotrik MR, Wu MJ, Wayment-Steele HK, Kladwang W, Portela F, Wellington-Oguri R; Eterna Participants, Das R, Greenleaf WJTMC-StanfordSignificance: Our manuscript presents a paradigm for carrying out distributed science. We have harnessed an online RNA design game, Eterna, to challenge a large community of RNA designers to create diverse RNA sensors. RNA is an attractive, biocompatible substrate for the design and implementation of molecular sensors. We tasked the diverse Eterna community, comprising a global network of molecular design enthusiasts, to submit thousands to tens of thousands of "solutions" to these RNA sensor design challenges. Crucially, community designs were synthesized and tested experimentally in the real world using high-throughput methods for biochemical assays built on repurposed DNA sequencers. The best player-generated designs for RNA sensors approached the thermodynamic optimum.
2022-05-03PHGDH expression increases with progression of Alzheimer’s disease pathology and symptomsChen X, Calandrelli R, Girardini J, Yan Z, Tan Z, Xu X, Hiniker A, Zhong STTD-UCSD/City of HopeChen et al. reveal an increase of phosphoglycerate dehydrogenase (PHGDH) mRNA and protein levels in two mouse models and four human cohorts in Alzheimer's disease brains compared to age- and sex-matched control brains. The increase of PHGDH expression in human brain correlates with symptomatic development and disease pathology.
2022-05-11Viv: multiscale visualization of high-resolution multiplexed bioimaging data on the webManz T, Gold I, Patterson NH, McCallum C, Keller MS, Herr BW 2nd, Börner K, Spraggins JM, Gehlenborg NHIVE TC-HarvardNA
2022-05-13Nuclear oligo hashing improves differential analysis of single-cell RNA-seqKim HJ, Booth G, Saunders L, Srivatsan S, McFaline-Figueroa JL, Trapnell CTMC-Cal TechSingle-cell RNA sequencing (scRNA-seq) offers a high-resolution molecular view into complex tissues, but suffers from high levels of technical noise which frustrates efforts to compare the gene expression programs of different cell types. "Spike-in" RNA standards help control for technical variation in scRNA-seq, but using them with recently developed, ultra-scalable scRNA-seq methods based on combinatorial indexing is not feasible. Here, we describe a simple and cost-effective method for normalizing transcript counts and subtracting technical variability that improves differential expression analysis in scRNA-seq. The method affixes a ladder of synthetic single-stranded DNA oligos to each cell that appears in its RNA-seq library. With improved normalization we explore chemical perturbations with broad or highly specific effects on gene regulation, including RNA pol II elongation, histone deacetylation, and activation of the glucocorticoid receptor. Our methods reveal that inhibiting histone deacetylation prevents cells from executing their canonical program of changes following glucocorticoid stimulation.
2022-05-31NEAT-seq: simultaneous profiling of intra-nuclear proteins, chromatin accessibility and gene expression in single cellsChen AF, Parks B, Kathiria AS, Ober-Reynolds B, Goronzy JJ, Greenleaf WJTMC-StanfordIn this work, we describe NEAT-seq (sequencing of nuclear protein epitope abundance, chromatin accessibility and the transcriptome in single cells), enabling interrogation of regulatory mechanisms spanning the central dogma. We apply this technique to profile CD4 memory T cells using a panel of master transcription factors (TFs) that drive T cell subsets and identify examples of TFs with regulatory activity gated by transcription, translation and regulation of chromatin binding. We also link a noncoding genome-wide association study single-nucleotide polymorphism (SNP) within a GATA motif to a putative target gene, using NEAT-seq data to internally validate SNP impact on GATA3 regulation.
2022-06-01Revealing new biology from multiplexed, metal-isotope-tagged, single-cell readoutsBaskar R, Kimmey SC, Bendall SCRTI-StanfordMass cytometry (MC) is a recent technology that pairs plasma-based ionization of cells in suspension with time-of-flight (TOF) mass spectrometry to sensitively quantify the single-cell abundance of metal-isotope-tagged affinity reagents to key proteins, RNA, and peptides. Given the ability to multiplex readouts (~50 per cell) and capture millions of cells per experiment, MC offers a robust way to assay rare, transitional cell states that are pertinent to human development and disease. Here, we review MC approaches that let us probe the dynamics of cellular regulation across multiple conditions and sample types in a single experiment. Additionally, we discuss current limitations and future extensions of MC as well as computational tools commonly used to extract biological insight from single-cell proteomic datasets.
2022-06-06Cell Trafficking at the Intersection of the Tumor-Immune CompartmentsDu W, Nair P, Johnston A, Wu PH, Wirtz DTMC-JHUMigration is an essential cellular process that regulates human organ development and homeostasis as well as disease initiation and progression. In cancer, immune and tumor cell migration is strongly associated with immune cell infiltration, immune escape, and tumor cell metastasis, which ultimately account for more than 90% of cancer deaths. The biophysics and molecular regulation of the migration of cancer and immune cells have been extensively studied separately. However, accumulating evidence indicates that, in the tumor microenvironment, the motilities of immune and cancer cells are highly interdependent via secreted factors such as cytokines and chemokines. Tumor and immune cells constantly express these soluble factors, which produce a tightly intertwined regulatory network for these cells' respective migration. A mechanistic understanding of the reciprocal regulation of soluble factor-mediated cell migration can provide critical information for the development of new biomarkers of tumor progression and of tumor response to immuno-oncological treatments. We review the biophysical and biomolecular basis for the migration of immune and tumor cells and their associated reciprocal regulatory network. We also describe ongoing attempts to translate this knowledge into the clinic.
2022-06-10A reference tissue atlas for the human kidneyHansen J, Sealfon R, Menon R, Eadon MT, Lake BB, Steck B, Anjani K, Parikh S, Sigdel TK, Zhang G, Velickovic D, Barwinska D, Alexandrov T, Dobi D, Rashmi P, Otto EA, Rivera M, Rose MP, Anderton CR, Shapiro JP, Pamreddy A, Winfree S, Xiong Y, He Y, de Boer IH, Hodgin JB, Barisoni L, Naik AS, Sharma K, Sarwal MM, Zhang K, Himmelfarb J, Rovin B, El-Achkar TM, Laszik Z, He JC, Dagher PC, Valerius MT, Jain S, Satlin LM, Troyanskaya OG, Kretzler M, Iyengar R, Azeloglu EU; Kidney Precision Medicine ProjectTMC-UCSDKidney Precision Medicine Project (KPMP) is building a spatially specified human kidney tissue atlas in health and disease with single-cell resolution. Here, we describe the construction of an integrated reference map of cells, pathways, and genes using unaffected regions of nephrectomy tissues and undiseased human biopsies from 56 adult subjects. We use single-cell/nucleus transcriptomics, subsegmental laser microdissection transcriptomics and proteomics, near-single-cell proteomics, 3D and CODEX imaging, and spatial metabolomics to hierarchically identify genes, pathways, and cells. Integrated data from these different technologies coherently identify cell types/subtypes within different nephron segments and the interstitium. These profiles describe cell-level functional organization of the kidney following its physiological functions and link cell subtypes to genes, proteins, metabolites, and pathways. They further show that messenger RNA levels along the nephron are congruent with the subsegmental physiological activity. This reference atlas provides a framework for the classification of kidney disease when multiple molecular mechanisms underlie convergent clinical phenotypes.
2022-06-14Chasing perfection: validation and polishing strategies for telomere-to-telomere genome assembliesMc Cartney AM, Shafin K, Alonge M, Bzikadze AV, Formenti G, Fungtammasan A, Howe K, Jain C, Koren S, Logsdon GA, Miga KH, Mikheenko A, Paten B, Shumate A, Soto DC, Sović I, Wood JMD, Zook JM, Phillippy AM, Rhie AHIVE TC-CMUAdvances in long-read sequencing technologies and genome assembly methods have enabled the recent completion of the first telomere-to-telomere human genome assembly, which resolves complex segmental duplications and large tandem repeats, including centromeric satellite arrays in a complete hydatidiform mole (CHM13). Although derived from highly accurate sequences, evaluation revealed evidence of small errors and structural misassemblies in the initial draft assembly. To correct these errors, we designed a new repeat-aware polishing strategy that made accurate assembly corrections in large repeats without overcorrection, ultimately fixing 51% of the existing errors and improving the assembly quality value from 70.2 to 73.9 measured from PacBio high-fidelity and Illumina k-mers. By comparing our results to standard automated polishing tools, we outline common polishing errors and offer practical suggestions for genome projects with limited resources. We also show how sequencing biases in both high-fidelity and Oxford Nanopore Technologies reads cause signature assembly errors that can be corrected with a diverse panel of sequencing technologies.
2022-06-14Merfin: improved variant filtering, assembly evaluation and polishing via k-mer validationFormenti G, Rhie A, Walenz BP, Thibaud-Nissen F, Shafin K, Koren S, Myers EW, Jarvis ED, Phillippy AMHIVE TC-CMUVariant calling has been widely used for genotyping and for improving the consensus accuracy of long-read assemblies. Variant calls are commonly hard-filtered with user-defined cutoffs. However, it is impossible to define a single set of optimal cutoffs, as the calls heavily depend on the quality of the reads, the variant caller of choice and the quality of the unpolished assembly. Here, we introduce Merfin, a k-mer based variant-filtering algorithm for improved accuracy in genotyping and genome assembly polishing. Merfin evaluates each variant based on the expected k-mer multiplicity in the reads, independently of the quality of the read alignment and variant caller's internal score. Merfin increased the precision of genotyped calls in several benchmarks, improved consensus accuracy and reduced frameshift errors when applied to human and nonhuman assemblies built from Pacific Biosciences HiFi and continuous long reads or Oxford Nanopore reads, including the first complete human genome. Moreover, we introduce assembly quality and completeness metrics that account for the expected genomic copy numbers.
2022-06-20Single-cell analyses define a continuum of cell state and composition changes in the malignant transformation of polyps to colorectal cancerBecker WR, Nevins SA, Chen DC, Chiu R, Horning AM, Guha TK, Laquindanum R, Mills M, Chaib H, Ladabaum U, Longacre T, Shen J, Esplin ED, Kundaje A, Ford JM, Curtis C, Snyder MP, Greenleaf WJTMC-StanfordTo chart cell composition and cell state changes that occur during the transformation of healthy colon to precancerous adenomas to colorectal cancer (CRC), we generated single-cell chromatin accessibility profiles and single-cell transcriptomes from 1,000 to 10,000 cells per sample for 48 polyps, 27 normal tissues and 6 CRCs collected from patients with or without germline APC mutations. A large fraction of polyp and CRC cells exhibit a stem-like phenotype, and we define a continuum of epigenetic and transcriptional changes occurring in these stem-like cells as they progress from homeostasis to CRC. Advanced polyps contain increasing numbers of stem-like cells, regulatory T cells and a subtype of pre-cancer-associated fibroblasts. In the cancerous state, we observe T cell exhaustion, RUNX1-regulated cancer-associated fibroblasts and increasing accessibility associated with HNF4A motifs in epithelia. DNA methylation changes in sporadic CRC are strongly anti-correlated with accessibility changes along this continuum, further identifying regulatory markers for molecular staging of polyps.
2022-06-30Integrated single cell sequencing and histopathological analyses reveal diverse injury and repair responses in a participant with acute kidney injury: A clinical-molecular-pathologic correlation.Menon R, Bomback AS, Lake BB, Stutzke C, Grewenow SM, Menez S, D'Agati VD, Jain STMC-UCSDNA
2022-06-30Temporal modelling using single-cell transcriptomicsDing J, Sharon N, Bar-Joseph ZHIVE TC-CMUMethods for profiling genes at the single-cell level have revolutionized our ability to study several biological processes and systems including development, differentiation, response programmes and disease progression. In many of these studies, cells are profiled over time in order to infer dynamic changes in cell states and types, sets of expressed genes, active pathways and key regulators. However, time-series single-cell RNA sequencing (scRNA-seq) also raises several new analysis and modelling issues. These issues range from determining when and how deep to profile cells, linking cells within and between time points, learning continuous trajectories, and integrating bulk and single-cell data for reconstructing models of dynamic networks. In this Review, we discuss several approaches for the analysis and modelling of time-series scRNA-seq, highlighting their steps, key assumptions, and the types of data and biological questions they are most appropriate for.
2022-06-30Deep learning identification of stiffness markers in breast cancerSneider A, Kiemen A, Kim JH, Wu PH, Habibi M, White M, Phillip JM, Gu L, Wirtz DTMC-JHUWhile essential to our understanding of solid tumor progression, the study of cell and tissue mechanics has yet to find traction in the clinic. Determining tissue stiffness, a mechanical property known to promote a malignant phenotype in vitro and in vivo, is not part of the standard algorithm for the diagnosis and treatment of breast cancer. Instead, clinicians routinely use mammograms to identify malignant lesions and radiographically dense breast tissue is associated with an increased risk of developing cancer. Whether breast density is related to tumor tissue stiffness, and what cellular and non-cellular components of the tumor contribute the most to its stiffness are not well understood. Through training of a deep learning network and mechanical measurements of fresh patient tissue, we create a bridge in understanding between clinical and mechanical markers. The automatic identification of cellular and extracellular features from hematoxylin and eosin (H&E)-stained slides reveals that global and local breast tissue stiffness best correlate with the percentage of straight collagen. Importantly, the percentage of dense breast tissue does not directly correlate with tissue stiffness or straight collagen content.
2022-07-07scSTEM: clustering pseudotime ordered single-cell dataSong Q, Wang J, Bar-Joseph ZHIVE TC-CMUWe develop scSTEM, single-cell STEM, a method for clustering dynamic profiles of genes in trajectories inferred from pseudotime ordering of single-cell RNA-seq (scRNA-seq) data. scSTEM uses one of several metrics to summarize the expression of genes and assigns a p-value to clusters enabling the identification of significant profiles and comparison of profiles across different paths. Application of scSTEM to several scRNA-seq datasets demonstrates its usefulness and ability to improve downstream analysis of biological processes. scSTEM is available at https://github.com/alexQiSong/scSTEM .
2022-07-07Ferroptosis turns 10: Emerging mechanisms, physiological functions, and therapeutic applicationsStockwell BRTTD-Columbia/Penn StateFerroptosis, a form of cell death driven by iron-dependent lipid peroxidation, was identified as a distinct phenomenon and named a decade ago. Ferroptosis has been implicated in a broad set of biological contexts, from development to aging, immunity, and cancer. This review describes key regulators of this form of cell death within a framework of metabolism, ROS biology, and iron biology. Key concepts and major unanswered questions in the ferroptosis field are highlighted. The next decade promises to yield further breakthroughs in the mechanisms governing ferroptosis and additional ways of harnessing ferroptosis for therapeutic benefit.
2022-07-11Reproducible, high-dimensional imaging in archival human tissue by multiplexed ion beam imaging by time-of-flight (MIBI-TOF)Liu CC, Bosse M, Kong A, Kagel A, Kinders R, Hewitt SM, Varma S, van de Rijn M, Nowak SH, Bendall SC, Angelo MRTI-StanfordMultiplexed ion beam imaging by time-of-flight (MIBI-TOF) is a form of mass spectrometry imaging that uses metal labeled antibodies and secondary ion mass spectrometry to image dozens of proteins simultaneously in the same tissue section. Working with the National Cancer Institute's (NCI) Cancer Immune Monitoring and Analysis Centers (CIMAC), we undertook a validation study, assessing concordance across a dozen serial sections of a tissue microarray of 21 samples that were independently processed and imaged by MIBI-TOF or single-plex immunohistochemistry (IHC) over 12 days. Pixel-level features were highly concordant across all 16 targets assessed in both staining intensity (R² = 0.94 ± 0.04) and frequency (R² = 0.95 ± 0.04). Comparison to digitized, single-plex IHC on adjacent serial sections revealed similar concordance (R² = 0.85 ± 0.08) as well. Lastly, automated segmentation and clustering of eight cell populations found that cell frequencies between serial sections yielded an average correlation of R² = 0.94 ± 0.05. Taken together, we demonstrate that MIBI-TOF, with well-vetted reagents and automated analysis, can generate consistent and quantitative annotations of clinically relevant cell states in archival human tissue, and more broadly, present a scalable framework for benchmarking multiplexed IHC approaches.
2022-07-12High-Throughput Nano-DESI Mass Spectrometry Imaging of Biological Tissues Using an Integrated Microfluidic ProbeLi X, Hu H, Yin R, Li Y, Sun X, Dey SK, Laskin JTTD-PurdueNanospray desorption electrospray mass spectrometry imaging (nano-DESI MSI) enables quantitative mapping of hundreds of molecules in biological samples with minimal sample pretreatment. We have recently developed an integrated microfluidic probe (iMFP) for nano-DESI MSI. Herein, we describe an improved design of the iMFP for the high-throughput imaging of tissue sections. We increased the dimensions of the primary and spray channels and optimized the spray voltage and solvent flow rate to obtain a stable operation of the iMFP at both low and high scan rates. We observe that the sensitivity, molecular coverage, and spatial resolution obtained using the iMFP do not change to a significant extent as the scan rate increases. Using a scan rate of 0.4 mm/s, we obtained high-quality images of mouse uterine tissue sections (scan area: 3.2 mm × 2.3 mm) in only 9.5 min and of mouse brain tissue (scan area: 7.0 mm × 5.4 mm) in 21.7 min, which corresponds to a 10-15-fold improvement in the experimental throughput. We have also developed a quantitative metric for evaluating the quality of ion images obtained at different scan rates. Using this metric, we demonstrate that the quality of nano-DESI MSI data does not degrade substantially with an increase in the scan rate. The ability to image biological tissues with high throughput using iMFP-based nano-DESI MSI will substantially speed up tissue mapping efforts.
2022-07-18Proteoform-Selective Imaging of Tissues Using Mass SpectrometryYang M, Hu H, Su P, Thomas PM, Camarillo JM, Greer JB, Early BP, Fellers RT, Kelleher NL, Laskin JTTD-PurdueUnraveling the complexity of biological systems relies on the development of new approaches for spatially resolved proteoform-specific analysis of the proteome. Herein, we employ nanospray desorption electrospray ionization mass spectrometry imaging (nano-DESI MSI) for the proteoform-selective imaging of biological tissues. Nano-DESI generates multiply charged protein ions, which is advantageous for their structural characterization using tandem mass spectrometry (MS/MS) directly on the tissue. Proof-of-concept experiments demonstrate that nano-DESI MSI combined with on-tissue top-down proteomics is ideally suited for the proteoform-selective imaging of tissue sections. Using rat brain tissue as a model system, we provide the first evidence of differential proteoform expression in different regions of the brain.
2022-07-25Accelerated identification of disease-causing variants with ultra-rapid nanopore genome sequencingGoenka SD, Gorzynski JE, Shafin K, Fisk DG, Pesout T, Jensen TD, Monlong J, Chang PC, Baid G, Bernstein JA, Christle JW, Dalton KP, Garalde DR, Grove ME, Guillory J, Kolesnikov A, Nattestad M, Ruzhnikov MRZ, Samadi M, Sethia A, Spiteri E, Wright CJ, Xiong K, Zhu T, Jain M, Sedlazeck FJ, Carroll A, Paten B, Ashley EAHIVE TC-CMUWhole-genome sequencing (WGS) can identify variants that cause genetic disease, but the time required for sequencing and analysis has been a barrier to its use in acutely ill patients. In the present study, we develop an approach for ultra-rapid nanopore WGS that combines an optimized sample preparation protocol, distributing sequencing over 48 flow cells, near real-time base calling and alignment, accelerated variant calling and fast variant filtration for efficient manual review. Application to two example clinical cases identified a candidate variant in <8 h from sample preparation to variant identification. We show that this framework provides accurate variant calls and efficient prioritization, and accelerates diagnostic clinical genome sequencing twofold compared with previous approaches.
2022-07-26Hanging drop sample preparation improves sensitivity of spatial proteomicsKwon Y, Piehowski PD, Zhao R, Sontag RL, Moore RJ, Burnum-Johnson KE, Smith RD, Qian WJ, Kelly RT, Zhu YTMC-PNNLSpatial proteomics holds great promise for revealing tissue heterogeneity in both physiological and pathological conditions. However, one significant limitation of most spatial proteomics workflows is the requirement of large sample amounts that blurs cell-type-specific or microstructure-specific information. In this study, we developed an improved sample preparation approach for spatial proteomics and integrated it with our previously-established laser capture microdissection (LCM) and microfluidics sample processing platform. Specifically, we developed a hanging drop (HD) method to improve the sample recovery by positioning a nanowell chip upside-down during protein extraction and tryptic digestion steps. Compared with the commonly-used sitting-drop method, the HD method keeps the tissue pixel away from the container surface, and thus improves the accessibility of the extraction/digestion buffer to the tissue sample. The HD method can increase the MS signal by 7-fold, leading to a 66% increase in the number of identified proteins. An average of 721, 1489, and 2521 proteins can be quantitatively profiled from laser-dissected 10 μm-thick mouse liver tissue pixels with areas of 0.0025, 0.01, and 0.04 mm², respectively. The improved system was further validated in the study of cell-type-specific proteomes of mouse uterine tissues.
2022-07-29New Views of Old Proteins: Clarifying the Enigmatic ProteomeBurnum-Johnson KE, Conrads TP, Drake RR, Herr AE, Iyengar R, Kelly RT, Lundberg E, MacCoss MJ, Naba A, Nolan GP, Pevzner PA, Rodland KD, Sechi S, Slavov N, Spraggins JM, Van Eyk JE, Vidal M, Vogel C, Walt DR, Kelleher NLTMC-StanfordAll human diseases involve proteins, yet our current tools to characterize and quantify them are limited. To better elucidate proteins across space, time, and molecular composition, we provide a projection spanning more than 10 years for technologies to meet the challenges that protein biology presents. With a broad perspective, we discuss grand opportunities to transition the science of proteomics into a more propulsive enterprise. Extrapolating recent trends, we describe a next generation of approaches to define, quantify, and visualize the multiple dimensions of the proteome, thereby transforming our understanding and interactions with human disease in the coming decade.
2022-07-31Clinical Implementation and Initial Experience With a 1.5 Tesla MR-Linac for MR-Guided Radiation Therapy for Gynecologic Cancer: An R-IDEAL Stage 1 and 2a First in Humans Feasibility Study of New Technology ImplementationLakomy DS, Yang J, Vedam S, Wang J, Lee B, Sobremonte A, Castillo P, Hughes N, Mohammedsaid M, Jhingran A, Klopp AH, Choi S, Fuller CD, Lin LLHIVE IEC-PSCPurpose: Magnetic resonance imaging-guided linear accelerator systems (MR-linacs) can facilitate the daily adaptation of radiation therapy plans. Here, we report our early clinical experience using an MR-linac for adaptive radiation therapy of gynecologic malignancies. Methods and materials: Treatments were planned with an Elekta Monaco v5.4.01 and delivered by a 1.5 Tesla Elekta Unity MR-linac. The system offers a choice of daily adaptation based on either position (ATP) or shape (ATS) of the tumor and surrounding normal structures. The ATS approach has the option of manually editing the contours of tumors and surrounding normal structures before the plan is adapted. Here, we documented the duration of each treatment fraction; set-up variability (assessed by isocenter shifts in each plan) between fractions; and, for quality assurance, calculated the percentage of plans meeting the γ-criterion of 3%/3-mm distance to agreement. Deformable accumulated dose calculations were used to compare accumulated versus planned dose for patients treated with exclusively ATP fractions. Results: Of the 10 patients treated with 90 fractions on the MR-linac, most received boost doses to recurrence in nodes or isolated tumors. Each treatment fraction lasted a median of 32 minutes; fractions were shorter with ATP than with ATS (30 min vs 42 min, P < .0001). The γ criterion for all fraction plans exceeded 90% (median, 99.9%; range, 92.4%-100%; ie, all plans passed quality assurance testing). The average extent of isocenter shift was <0.5 cm in each axis.
The accumulated dose to the gross tumor volume was within 5% of the reference plan for all ATP cases. Accumulated doses for lesions in the pelvic periphery were within <1% of the reference plan as opposed to -1.6% to -4.4% for central pelvic tumors. Conclusions: The MR-linac is a reliable and clinically feasible tool for treating patients with gynecologic cancer.
2022-07-31Multi-contrast computed tomography healthy kidney atlasLee HH, Tang Y, Xu K, Bao S, Fogo AB, Harris R, de Caestecker MP, Heinrich M, Spraggins JM, Huo Y, Landman BATMC-Vanderbilt (Kidney)The construction of three-dimensional multi-modal tissue maps provides an opportunity to spur interdisciplinary innovations across temporal and spatial scales through information integration. While the preponderance of effort is allocated to the cellular level and explores the changes in cell interactions and organizations, contextualizing findings within organs and systems is essential to visualize and interpret higher resolution linkage across scales. There is a substantial normal variation of kidney morphometry and appearance across body size, sex, and imaging protocols in abdominal computed tomography (CT). A volumetric atlas framework is needed to integrate and visualize the variability across scales. However, there is no abdominal and retroperitoneal organs atlas framework for multi-contrast CT. Hence, we proposed a high-resolution CT retroperitoneal atlas specifically optimized for the kidney organ across non-contrast CT and early arterial, late arterial, venous and delayed contrast-enhanced CT. We introduce a deep learning-based volume interest extraction method by localizing the 2D slices with a representative score and crop within the range of the abdominal interest. An automated two-stage hierarchical registration pipeline is then performed to register abdominal volumes to a high-resolution CT atlas template with DEEDS affine and non-rigid registration. To generate and evaluate the atlas framework, multi-contrast modality CT scans of 500 subjects (without reported history of renal disease, age: 15-50 years, 250 males & 250 females) were processed.
PDD-Net with affine registration achieved the best overall mean DICE for portal venous phase multi-organs label transfer with the registration pipeline (0.540 ± 0.275, p < 0.0001 Wilcoxon signed-rank test) compared to the other registration tools. It also demonstrated the best performance with the median DICE over 0.8 in transferring the kidney information to the atlas space. DEEDS performs consistently with stable transferring performance in all phases average mapping including significant clear boundary of kidneys with contrastive characteristics, while PDD-Net only demonstrates a stable kidney registration in the average mapping of early and late arterial, and portal venous phase. The variance mappings demonstrate the low intensity variance in the kidney regions with DEEDS across all contrast phases and with PDD-Net across late arterial and portal venous phase. We demonstrate a stable generalizability of the atlas template for integrating the normal kidney variation from small to large, across contrast modalities and populations with great variability of demographics. The linkage of atlas and demographics provided a better understanding of the variation of kidney anatomy across populations.
2022-08-01Unilateral Radiation Therapy for Tonsillar Cancer: Treatment Outcomes in the Era of Human Papillomavirus, Positron-Emission Tomography, and Intensity Modulated Radiation TherapyTaku N, Chronowski G, Brandon Gunn G, Morrison WH, Gross ND, Moreno AC, Ferrarotto R, Frank SJ, Fuller CD, Goepfert RP, Phan J, Lai SY, Reddy JP, Rosenthal DI, Garden ASHIVE IEC-PSCPurpose: The goal of this study was to evaluate disease, survival, and toxic effects after unilateral radiation therapy treatment for tonsillar cancer. Methods and materials: A retrospective study was performed of patients treated at our institution within the period from 2000 to 2018. Summary statistics were used to assess the cohort by patient characteristics and treatments delivered. The Kaplan-Meier method was used to determine survival outcomes. Results: The cohort comprised 403 patients, including 343 (85%) with clinical and/or radiographic evidence of ipsilateral cervical nodal disease and 181 (45%) with multiple involved nodes. Human papillomavirus was detected in 294 (73%) tumors. Median follow-up time was 5.8 years. Disease relapse was infrequent with local recurrence in 9 (2%) patients, neck recurrence in 13 (3%) patients, and recurrence in the unirradiated contralateral neck in 9 (2%) patients. Five- and 10-year overall survival rates were 94% and 89%, respectively. Gastrostomy tubes were needed in 32 (9%) patients, and no patient had a feeding tube 6 months after therapy. Conclusions: For patients with well-lateralized tonsillar tumors and no clinically evident adenopathy of the contralateral neck, unilateral radiation therapy offers favorable rates of disease outcomes and a relatively low toxicity profile.
2022-08-10Highly multiplexed, label-free proteoform imaging of tissues by individual ion mass spectrometrySu P, McGee JP, Durbin KR, Hollas MAR, Yang M, Neumann EK, Allen JL, Drown BS, Butun FA, Greer JB, Early BP, Fellers RT, Spraggins JM, Laskin J, Camarillo JM, Kafader JO, Kelleher NLConsortiumImaging of proteoforms in human tissues is hindered by low molecular specificity and limited proteome coverage. Here, we introduce proteoform imaging mass spectrometry (PiMS), which increases the size limit for proteoform detection and identification by fourfold compared to reported methods and reveals tissue localization of proteoforms at <80-μm spatial resolution. PiMS advances proteoform imaging by combining ambient nanospray desorption electrospray ionization with ion detection using individual ion mass spectrometry. We demonstrate highly multiplexed proteoform imaging of human kidney, annotating 169 of 400 proteoforms of <70 kDa using top-down MS and a database lookup of ~1000 kidney candidate proteoforms, including dozens of key enzymes in primary metabolism. PiMS images reveal distinct spatial localizations of proteoforms to both anatomical structures and cellular neighborhoods in the vasculature, medulla, and cortex regions of the human kidney. The benefits of PiMS are poised to increase proteome coverage for label-free protein imaging of tissues.
2022-08-16Characterizing cellular heterogeneity in chromatin state with scCUT&Tag-proZhang B, Srivastava A, Mimitou E, Stuart T, Raimondi I, Hao Y, Smibert P, Satija RHIVE MC-NYGCTechnologies that profile chromatin modifications at single-cell resolution offer enormous promise for functional genomic characterization, but the sparsity of the measurements and integrating multiple binding maps represent substantial challenges. Here we introduce single-cell (sc)CUT&Tag-pro, a multimodal assay for profiling protein-DNA interactions coupled with the abundance of surface proteins in single cells. In addition, we introduce single-cell ChromHMM, which integrates data from multiple experiments to infer and annotate chromatin states based on combinatorial histone modification patterns. We apply these tools to perform an integrated analysis across nine different molecular modalities in circulating human immune cells. We demonstrate how these two approaches can characterize dynamic changes in the function of individual genomic elements across both discrete cell states and continuous developmental trajectories, nominate associated motifs and regulators that establish chromatin states and identify extensive and cell-type-specific regulatory priming. Finally, we demonstrate how our integrated reference can serve as a scaffold to map and improve the interpretation of additional scCUT&Tag datasets.
2022-08-17Spatial profiling of chromatin accessibility in mouse and human tissuesDeng Y, Bartosovic M, Ma S, Zhang D, Kukanja P, Xiao Y, Su G, Liu Y, Qin X, Rosoklija GB, Dwork AJ, Mann JJ, Xu ML, Halene S, Craft JE, Leong KW, Boldrini M, Castelo-Branco G, Fan RTTD-YaleCellular function in tissue is dependent on the local environment, requiring new methods for spatial mapping of biomolecules and cells in the tissue context. The emergence of spatial transcriptomics has enabled genome-scale gene expression mapping, but the ability to capture spatial epigenetic information of tissue at the cellular level and genome scale is lacking. Here we describe a method for spatially resolved chromatin accessibility profiling of tissue sections using next-generation sequencing (spatial-ATAC-seq) by combining in situ Tn5 transposition chemistry and microfluidic deterministic barcoding. Profiling mouse embryos using spatial-ATAC-seq delineated tissue-region-specific epigenetic landscapes and identified gene regulators involved in the development of the central nervous system. Mapping the accessible genome in the mouse and human brain revealed the intricate arealization of brain regions. Applying spatial-ATAC-seq to tonsil tissue resolved the spatially distinct organization of immune cell types and states in lymphoid follicles and extrafollicular zones. This technology progresses spatial biology by enabling spatially resolved chromatin accessibility profiling to improve our understanding of cell identity, cell state and cell fate decision in relation to epigenetic underpinnings in development and disease.
2022-08-19A user-friendly tool for cloud-based whole slide image segmentation with examples from renal histopathologyLutnick B, Manthey D, Becker JU, Ginley B, Moos K, Zuckerman JE, Rodrigues L, Gallan AJ, Barisoni L, Alpers CE, Wang XX, Myakala K, Jones BA, Levi M, Kopp JB, Yoshida T, Zee J, Han SS, Jain S, Rosenberg AZ, Jen KY, Sarder P; Kidney Precision Medicine ProjectTMC-UCSDBackground: Image-based machine learning tools hold great promise for clinical applications in pathology research. However, the ideal end-users of these computational tools (e.g., pathologists and biological scientists) often lack the programming experience required for the setup and use of these tools which often rely on the use of command line interfaces. Methods: We have developed Histo-Cloud, a tool for segmentation of whole slide images (WSIs) that has an easy-to-use graphical user interface. This tool runs a state-of-the-art convolutional neural network (CNN) for segmentation of WSIs in the cloud and allows the extraction of features from segmented regions for further analysis. Results: By segmenting glomeruli, interstitial fibrosis and tubular atrophy, and vascular structures from renal and non-renal WSIs, we demonstrate the scalability, best practices for transfer learning, and effects of dataset variability. Finally, we demonstrate an application for animal model research, analyzing glomerular features in three murine models. Conclusions: Histo-Cloud is open source, accessible over the internet, and adaptable for segmentation of any histological structure regardless of stain.
2022-09-05Without appropriate metadata, data-sharing mandates are pointlessMusen, MAHIVE MC-IUN/A
2022-09-08Phenotypes of disease severity in a cohort of hospitalized COVID-19 patients: Results from the IMPACC studyOzonoff A, Schaenman J, Jayavelu ND, Milliren CE, Calfee CS, Cairns CB, Kraft M, Baden LR, Shaw AC, Krammer F, van Bakel H, Esserman DA, Liu S, Sesma AF, Simon V, Hafler DA, Montgomery RR, Kleinstein SH, Levy O, Bime C, Haddad EK, Erle DJ, Pulendran B, Nadeau KC, Davis MM, Hough CL, Messer WB, Higuita NIA, Metcalf JP, Atkinson MA, Brakenridge SC, Corry D, Kheradmand F, Ehrlich LIR, Melamed E, McComsey GA, Sekaly R, Diray-Arce J, Peters B, Augustine AD, Reed EF, Altman MC, Becker PM, Rouphael N; IMPACC study group membersTMC-FloridaBackground: Better understanding of the association between characteristics of patients hospitalized with coronavirus disease 2019 (COVID-19) and outcome is needed to further improve upon patient management. Methods: Immunophenotyping Assessment in a COVID-19 Cohort (IMPACC) is a prospective, observational study of 1164 patients from 20 hospitals across the United States. Disease severity was assessed using a 7-point ordinal scale based on degree of respiratory illness. Patients were prospectively surveyed for 1 year after discharge for post-acute sequelae of COVID-19 (PASC) through quarterly surveys. Demographics, comorbidities, radiographic findings, clinical laboratory values, SARS-CoV-2 PCR and serology were captured over a 28-day period. Multivariable logistic regression was performed. Findings: The median age was 59 years (interquartile range [IQR] 20); 711 (61%) were men; overall mortality was 14%, and 228 (20%) required invasive mechanical ventilation. Unsupervised clustering of ordinal score over time revealed distinct disease course trajectories.
Risk factors associated with prolonged hospitalization or death by day 28 included age ≥ 65 years (odds ratio [OR], 2.01; 95% CI 1.28-3.17), Hispanic ethnicity (OR, 1.71; 95% CI 1.13-2.57), elevated baseline creatinine (OR 2.80; 95% CI 1.63-4.80) or troponin (OR 1.89; 95% CI 1.03-3.47), baseline lymphopenia (OR 2.19; 95% CI 1.61-2.97), presence of infiltrate by chest imaging (OR 3.16; 95% CI 1.96-5.10), and high SARS-CoV-2 viral load (OR 1.53; 95% CI 1.17-2.00). Fatal cases had the lowest ratio of SARS-CoV-2 antibody to viral load levels compared to other trajectories over time (p=0.001). 589 survivors (51%) completed at least one survey at follow-up with 305 (52%) having at least one symptom consistent with PASC, most commonly dyspnea (56% among symptomatic patients). Female sex was the only associated risk factor for PASC. Interpretation: Integration of PCR cycle threshold, and antibody values with demographics, comorbidities, and laboratory/radiographic findings identified risk factors for 28-day outcome severity, though only female sex was associated with PASC. Longitudinal clinical phenotyping offers important insights, and provides a framework for immunophenotyping for acute and long COVID-19.
2022-09-12CINS: Cell Interaction Network inference from Single cell expression dataYuan Y, Cosme C Jr, Adams TS, Schupp J, Sakamoto K, Xylourgidis N, Ruffalo M, Li J, Kaminski N, Bar-Joseph ZHIVE TC-CMUStudies comparing single cell RNA-Seq (scRNA-Seq) data between conditions mainly focus on differences in the proportion of cell types or on differentially expressed genes. In many cases these differences are driven by changes in cell interactions which are challenging to infer without spatial information. To determine cell-cell interactions that differ between conditions we developed the Cell Interaction Network Inference (CINS) pipeline. CINS combines Bayesian network analysis with regression-based modeling to identify differential cell type interactions and the proteins that underlie them. We tested CINS on a disease case control and on an aging mouse dataset. In both cases CINS correctly identifies cell type interactions and the ligands involved in these interactions improving on prior methods suggested for cell interaction predictions. We performed additional mouse aging scRNA-Seq experiments which further support the interactions identified by CINS.
2022-09-15Overcoming selection bias in synthetic lethality predictionSeale C, Tepeli Y, Gonçalves JPTMC-Vanderbilt (Eye/pancreas)Motivation: Synthetic lethality (SL) between two genes occurs when simultaneous loss of function leads to cell death. This holds great promise for developing anti-cancer therapeutics that target synthetic lethal pairs of endogenously disrupted genes. Identifying novel SL relationships through exhaustive experimental screens is challenging, due to the vast number of candidate pairs. Computational SL prediction is therefore sought to identify promising SL gene pairs for further experimentation. However, current SL prediction methods lack consideration for generalizability in the presence of selection bias in SL data. Results: We show that SL data exhibit considerable gene selection bias. Our experiments designed to assess the robustness of SL prediction reveal that models driven by the topology of known SL interactions (e.g. graph, matrix factorization) are especially sensitive to selection bias. We introduce selection bias-resilient synthetic lethality (SBSL) prediction using regularized logistic regression or random forests. Each gene pair is described by 27 molecular features derived from cancer cell line, cancer patient tissue and healthy donor tissue samples. SBSL models are built and tested using approximately 8000 experimentally derived SL pairs across breast, colon, lung and ovarian cancers. Compared to other SL prediction methods, SBSL showed higher predictive performance, better generalizability and robustness to selection bias. Gene dependency, quantifying the essentiality of a gene for cell survival, contributed most to SBSL predictions. Random forests were superior to linear models in the absence of dependency features, highlighting the relevance of mutual exclusivity of somatic mutations, co-expression in healthy tissue and differential expression in tumour samples. 
Availability and implementation: https://github.com/joanagoncalveslab/sbsl. Supplementary information: Supplementary data are available at Bioinformatics online.
2022-09-20Enhanced Spatial Mapping of Histone Proteoforms in Human Kidney Through MALDI-MSI by High-Field UHMR-Orbitrap DetectionZemaitis KJ, Veličković D, Kew W, Fort KL, Reinhardt-Szyba M, Pamreddy A, Ding Y, Kaushik D, Sharma K, Makarov AA, Zhou M, Paša-Tolić LTTD-PNNLCore histones including H2A, H2B, H3, and H4 are key modulators of cellular repair, transcription, and replication within eukaryotic cells, playing vital roles in the pathogenesis of disease and cellular responses to environmental stimuli. Traditional mass spectrometry (MS)-based bottom-up and top-down proteomics allows for the comprehensive identification of proteins and of post-translational modification (PTM) harboring proteoforms. However, these methodologies have difficulties preserving near-cellular spatial distributions because they typically require laser capture microdissection (LCM) and advanced sample preparation techniques. Herein, we coupled a matrix-assisted laser desorption/ionization (MALDI) source with a Thermo Scientific Q Exactive HF Orbitrap MS upgraded with ultrahigh mass range (UHMR) boards for the first demonstration of complementary high-resolution accurate mass (HR/AM) measurements of proteoforms up to 16.5 kDa directly from tissues using this benchtop mass spectrometer. The platform achieved isotopic resolution throughout the detected mass range, providing confident assignments of proteoforms with low ppm mass error and a considerable increase in duty cycle over other Fourier transform mass analyzers. Proteoform mapping of core histones was demonstrated on sections of human kidney at near-cellular spatial resolution, with several key distributions of histone and other proteoforms noted within both healthy biopsy and a section from a renal cell carcinoma (RCC) containing nephrectomy. 
The use of MALDI-MS imaging (MSI) for proteoform mapping demonstrates several steps toward high-throughput accurate identification of proteoforms and provides a new tool for mapping biomolecule distributions throughout tissue sections in extended mass ranges.
2022-09-29Concurrent inhibition of CDK2 adds to the anti-tumour activity of CDK4/6 inhibition in GISTSchaefer IM, Hemming ML, Lundberg MZ, Serrata MP, Goldaracena I, Liu N, Yin P, Paulo JA, Gygi SP, George S, Morgan JA, Bertagnolli MM, Sicinska ET, Chu C, Zheng S, Mariño-Enríquez A, Hornick JL, Raut CP, Ou WB, Demetri GD, Saka SK, Fletcher JATTD-HarvardBackground: Advanced gastrointestinal stromal tumour (GIST) is characterised by genomic perturbations of key cell cycle regulators. Oncogenic activation of CDK4/6 results in RB1 inactivation and cell cycle progression. Given that single-agent CDK4/6 inhibitor therapy failed to show clinical activity in advanced GIST, we evaluated strategies for maximising response to therapeutic CDK4/6 inhibition. Methods: Targeted next-generation sequencing and multiplexed protein imaging were used to detect cell cycle regulator aberrations in GIST clinical samples. The impact of inhibitors of CDK2, CDK4 and CDK2/4/6 was determined through cell proliferation and protein detection assays. CDK-inhibitor resistance mechanisms were characterised in GIST cell lines after long-term exposure. Results: We identify recurrent genomic aberrations in cell cycle regulators causing co-activation of the CDK2 and CDK4/6 pathways in clinical GIST samples. Therapeutic co-targeting of CDK2 and CDK4/6 is synergistic in GIST cell lines with intact RB1, through inhibition of RB1 hyperphosphorylation and cell proliferation. Moreover, RB1 inactivation and a novel oncogenic cyclin D1 resulting from an intragenic rearrangement (CCND1::chr11.g:70025223) are mechanisms of acquired CDK-inhibitor resistance in GIST. Conclusions: These studies establish the biological rationale for CDK2 and CDK4/6 co-inhibition as a therapeutic strategy in patients with advanced GIST, including metastatic GIST progressing on tyrosine kinase inhibitors.
2022-10-04Machine learning-assisted elucidation of CD81-CD44 interactions in promoting cancer stemness and extracellular vesicle integrityRamos EK, Tsai CF, Jia Y, Cao Y, Manu M, Taftaf R, Hoffmann AD, El-Shennawy L, Gritsenko MA, Adorno-Cruz V, Schuster EJ, Scholten D, Patel D, Liu X, Patel P, Wray B, Zhang Y, Zhang S, Moore RJ, Mathews JV, Schipma MJ, Liu T, Tokars VL, Cristofanilli M, Shi T, Shen Y, Dashzeveg NK, Liu HTTD-PNNL/NorthwesternTumor-initiating cells with reprogramming plasticity or stem-progenitor cell properties (stemness) are thought to be essential for cancer development and metastatic regeneration in many cancers; however, elucidation of the underlying molecular network and pathways remains demanding. Combining machine learning and experimental investigation, here we report CD81, a tetraspanin transmembrane protein known to be enriched in extracellular vesicles (EVs), as a newly identified driver of breast cancer stemness and metastasis. Using protein structure modeling and interface prediction-guided mutagenesis, we demonstrate that membrane CD81 interacts with CD44 through their extracellular regions in promoting tumor cell cluster formation and lung metastasis of triple negative breast cancer (TNBC) in human and mouse models. In-depth global and phosphoproteomic analyses of tumor cells deficient with CD81 or CD44 unveils endocytosis-related pathway alterations, leading to further identification of a quality-keeping role of CD44 and CD81 in EV secretion as well as in EV-associated stemness-promoting function. CD81 is coexpressed along with CD44 in human circulating tumor cells (CTCs) and enriched in clustered CTCs that promote cancer stemness and metastasis, supporting the clinical significance of CD81 in association with patient outcomes. Our study highlights machine learning as a powerful tool in facilitating the molecular understanding of new molecular targets in regulating stemness and metastasis of TNBC.
2022-10-06A Parallelization Strategy for the Time Efficient Analysis of Thousands of LC/MS Runs in High-Performance Computing Environmentvan Zalm P, Viodé A, Smolen K, Fatou B, Hayati AN, Schlaffner CN, Levy O, Steen J, Steen HTMC-FloridaCombining robust proteomics instrumentation with high-throughput enabling liquid chromatography (LC) systems (e.g., timsTOF Pro and the Evosep One system, respectively) enabled mapping the proteomes of 1000s of samples. Fragpipe is one of the few computational protein identification and quantification frameworks that allows for the time-efficient analysis of such large data sets. However, it requires large amounts of computational power and data storage space that leave even state-of-the-art workstations underpowered when it comes to the analysis of proteomics data sets with 1000s of LC mass spectrometry runs. To address this issue, we developed and optimized a Fragpipe-based analysis strategy for a high-performance computing environment and analyzed 3348 plasma samples (6.4 TB) that were longitudinally collected from hospitalized COVID-19 patients under the auspices of the Immunophenotyping Assessment in a COVID-19 Cohort (IMPACC) study. Our parallelization strategy reduced the total runtime by ∼90% from 116 (theoretical) days to just 9 days in the high-performance computing environment. All code is open-source and can be deployed in any Simple Linux Utility for Resource Management (SLURM) high-performance computing environment, enabling the analysis of large-scale high-throughput proteomics studies.
2022-10-21Revving an Engine of Human Metabolism: Activity Enhancement of Triosephosphate Isomerase via Hemi-PhosphorylationSchachner LF, Soye BD, Ro S, Kenney GE, Ives AN, Su T, Goo YA, Jewett MC, Rosenzweig AC, Kelleher NLRTI-NorthwesternTriosephosphate isomerase (TPI) performs the 5th step in glycolysis, operates near the limit of diffusion, and is involved in "moonlighting" functions. Its dimer was found singly phosphorylated at Ser20 (pSer20) in human cells, with this post-translational modification (PTM) showing context-dependent stoichiometry and loss under oxidative stress. We generated synthetic pSer20 proteoforms using cell-free protein synthesis that showed enhanced TPI activity by 4-fold relative to unmodified TPI. Molecular dynamics simulations show that the phosphorylation enables a channel to form that shuttles substrate into the active site. Refolding, kinetic, and crystallographic analyses of point mutants including S20E/G/Q indicate that hetero-dimerization and subunit asymmetry are key features of TPI. Moreover, characterization of an endogenous human TPI tetramer also implicates tetramerization in enzymatic regulation. S20 is highly conserved across eukaryotic TPI, yet most prokaryotes contain E/D at this site, suggesting that phosphorylation of human TPI evolved a new switch to optionally boost an already fast enzyme. Overall, complete characterization of TPI shows how endogenous proteoform discovery can prioritize functional versus bystander PTMs.
2022-11-03Real-time spatial registration for 3D human atlasChen L, Teng D, Zhu T, Kong J, Herr BW, Bueckle A, Börner K, Wang FHIVE MC-IUThe human body is made up of about 37 trillion cells (adults). Each cell has its own unique role and is affected by its neighboring cells and environment. The NIH Human BioMolecular Atlas Program (HuBMAP) aims at developing a 3D atlas of human body consisting of organs, vessels, tissues to single cells with all 3D spatially registered in a single 3D human atlas using tissues obtained from normal individuals across a wide range of ages. A critical step of building the atlas is to register 3D tissue blocks in real-time to the right location of a human organ, which itself consists of complex 3D sub-structures. The complexity of the 3D organ model, e.g., 35 meshes for a typical kidney, poses a significant computational challenge for the registration. In this paper, we propose a comprehensive framework TICKET (TIssue bloCK rEgisTration) to support tissue block registration for 3D human atlas, including (1) 3D mesh pre-processing, (2) spatial queries on intersection relationship and (3) intersection volume computation between organs and tissue blocks. To minimize search space and computation cost, we develop multi-level indexing at both the anatomical structure level and mesh level, and utilize OpenMP for parallel computing. Considering cuboid based shape of the tissue block, we propose an efficient voxelization-based method to estimate the intersection volume. Our experiments demonstrate that the proposed framework is efficient and practical. TICKET is being integrated into the HuBMAP CCF registration portal [1].
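The voxelization idea mentioned in the abstract above can be illustrated with a minimal sketch. This is not the TICKET implementation: it assumes an axis-aligned cuboid block and replaces the organ mesh with a hypothetical point-in-organ predicate (`unit_sphere` is a stand-in chosen only because its volume is known). The function and parameter names are illustrative.

```python
from typing import Callable, Tuple

Point = Tuple[float, float, float]


def estimate_intersection_volume(
    block_min: Point,
    block_size: Point,
    inside_organ: Callable[[Point], bool],
    voxels_per_axis: int = 20,
) -> float:
    """Estimate volume(block ∩ organ) by counting voxel centers inside the organ.

    The cuboid block is split into voxels_per_axis**3 equal voxels; each voxel
    whose center lies inside the organ contributes its full volume.
    """
    n = voxels_per_axis
    dx, dy, dz = (block_size[0] / n, block_size[1] / n, block_size[2] / n)
    voxel_volume = dx * dy * dz
    hits = 0
    for i in range(n):
        x = block_min[0] + (i + 0.5) * dx
        for j in range(n):
            y = block_min[1] + (j + 0.5) * dy
            for k in range(n):
                z = block_min[2] + (k + 0.5) * dz
                if inside_organ((x, y, z)):
                    hits += 1
    return hits * voxel_volume


def unit_sphere(p: Point) -> bool:
    """Hypothetical stand-in for a point-in-mesh test: a sphere of radius 1."""
    return p[0] ** 2 + p[1] ** 2 + p[2] ** 2 <= 1.0


# A block that lies entirely inside the sphere: the estimate equals the block volume.
inner = estimate_intersection_volume((-0.25, -0.25, -0.25), (0.5, 0.5, 0.5), unit_sphere)
# A block that encloses the sphere: the estimate approaches the sphere volume.
outer = estimate_intersection_volume((-1.0, -1.0, -1.0), (2.0, 2.0, 2.0), unit_sphere, 40)
```

The accuracy of the estimate depends on the voxel resolution; a real system would replace the predicate with a query against the indexed organ meshes.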
2022-11-04Single-cell spatial proteomic imaging for human neuropathologyVijayaragavan K, Cannon BJ, Tebaykin D, Bossé M, Baranski A, Oliveria JP, Bukhari SA, Mrdjen D, Corces MR, McCaffrey EF, Greenwald NF, Sigal Y, Marquez D, Khair Z, Bruce T, Goldston M, Bharadwaj A, Montine KS, Angelo RM, Montine TJ, Bendall SCRTI-StanfordNeurodegenerative disorders are characterized by phenotypic changes and hallmark proteopathies. Quantifying these in archival human brain tissues remains indispensable for validating animal models and understanding disease mechanisms. We present a framework for nanometer-scale, spatial proteomics with multiplex ion beam imaging (MIBI) for capturing neuropathological features. MIBI facilitated simultaneous, quantitative imaging of 36 proteins on archival human hippocampus from individuals spanning cognitively normal to dementia. Customized analysis strategies identified cell types and proteopathies in the hippocampus across stages of Alzheimer's disease (AD) neuropathologic change. We show microglia-pathologic tau interactions in hippocampal CA1 subfield in AD dementia. Data driven, sample independent creation of spatial proteomic regions identified persistent neurons in pathologic tau neighborhoods expressing mitochondrial protein MFN2, regardless of cognitive status, suggesting a survival advantage. Our study revealed unique insights from multiplexed imaging and data-driven approaches for neuropathologic analysis and serves broadly as a methodology for spatial proteomic analysis of archival human neuropathology. TEASER: Multiplex Ion beam Imaging enables deep spatial phenotyping of human neuropathology-associated cellular and disease features.
2022-11-11Multiset multicover methods for discriminative marker selectionHasanaj E, Alavi A, Gupta A, Póczos B, Bar-Joseph ZHIVE TC-CMUMarkers are increasingly being used for several high-throughput data analysis and experimental design tasks. Examples include the use of markers for assigning cell types in scRNA-seq studies, for deconvolving bulk gene expression data, and for selecting marker proteins in single-cell spatial proteomics studies. Most marker selection methods focus on differential expression (DE) analysis. Although such methods work well for data with a few non-overlapping marker sets, they are not appropriate for large atlas-size datasets where several cell types and tissues are considered. To address this, we define the phenotype cover (PC) problem for marker selection and present algorithms that can improve the discriminative power of marker sets. Analysis of these sets on several marker-selection tasks suggests that these methods can lead to solutions that accurately distinguish different phenotypes in the data.
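To give a feel for cover-based marker selection as described in the abstract above, here is a minimal sketch of a generic greedy heuristic for set multicover. It is not the authors' algorithm and it simplifies the multiset formulation: each marker is assumed to either distinguish a phenotype pair or not. The marker names, pair labels, and the `demand` parameter are hypothetical.

```python
from typing import Dict, Hashable, List, Set


def greedy_multicover(
    covers: Dict[str, Set[Hashable]],
    demand: Dict[Hashable, int],
) -> List[str]:
    """Pick markers until every element is covered at least demand[element] times.

    covers maps a marker name to the set of elements (e.g. phenotype pairs)
    it distinguishes. Each marker may be chosen at most once. Ties are broken
    alphabetically so that the result is deterministic.
    """
    remaining = dict(demand)
    unused = set(covers)
    chosen: List[str] = []

    def gain(marker: str) -> int:
        # Number of still-unsatisfied elements this marker would help cover.
        return sum(1 for e in covers[marker] if remaining.get(e, 0) > 0)

    while any(r > 0 for r in remaining.values()):
        best = max(sorted(unused), key=gain, default=None)
        if best is None or gain(best) == 0:
            raise ValueError("demand cannot be met with the available markers")
        chosen.append(best)
        unused.remove(best)
        for e in covers[best]:
            if remaining.get(e, 0) > 0:
                remaining[e] -= 1
    return chosen


# Hypothetical toy input: which phenotype pairs each marker separates.
toy_covers = {
    "geneA": {"T-vs-B", "T-vs-NK"},
    "geneB": {"T-vs-B"},
    "geneC": {"T-vs-NK", "B-vs-NK"},
    "geneD": {"B-vs-NK"},
}
pairs = ["T-vs-B", "T-vs-NK", "B-vs-NK"]
once = greedy_multicover(toy_covers, {p: 1 for p in pairs})
twice = greedy_multicover(toy_covers, {p: 2 for p in pairs})
```

Raising the demand per pair trades a larger marker panel for redundancy, which is the intuition behind requiring each phenotype distinction to be supported by more than one marker.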
2022-11-12Modeling community standards for metadata as templates makes data FAIRMusen MA, O’Connor MJ, Schultes E, Martínez-Romero M, Hardi J, Graybeal JHIVE IEC-PSCIt is challenging to determine whether datasets are findable, accessible, interoperable, and reusable (FAIR) because the FAIR Guiding Principles refer to highly idiosyncratic criteria regarding the metadata used to annotate datasets. Specifically, the FAIR principles require metadata to be “rich” and to adhere to “domain-relevant” community standards. Scientific communities should be able to define their own machine-actionable templates for metadata that encode these “rich,” discipline-specific elements. We have explored this template-based approach in the context of two software systems. One system is the CEDAR Workbench, which investigators use to author new metadata. The other is the FAIRware Workbench, which evaluates the metadata of archived datasets for their adherence to community standards. Benefits accrue when templates for metadata become central elements in an ecosystem of tools to manage online datasets—both because the templates serve as a community reference for what constitutes FAIR data, and because they embody that perspective in a form that can be distributed among a variety of software applications to assist with data stewardship and data sharing.
2022-11-14Light-Seq: light-directed in situ barcoding of biomolecules in fixed cells and tissues for spatially indexed sequencingKishi JY, Liu N, West ER, Sheng K, Jordanides JJ, Serrata M, Cepko CL, Saka SK, Yin PTTD-HarvardWe present Light-Seq, an approach for multiplexed spatial indexing of intact biological samples using light-directed DNA barcoding in fixed cells and tissues followed by ex situ sequencing. Light-Seq combines spatially targeted, rapid photocrosslinking of DNA barcodes onto complementary DNAs in situ with a one-step DNA stitching reaction to create pooled, spatially indexed sequencing libraries. This light-directed barcoding enables in situ selection of multiple cell populations in intact fixed tissue samples for full-transcriptome sequencing based on location, morphology or protein stains, without cellular dissociation. Applying Light-Seq to mouse retinal sections, we recovered thousands of differentially enriched transcripts from three cellular layers and discovered biomarkers for a very rare neuronal subtype, dopaminergic amacrine cells, from only four to eight individual cells per section. Light-Seq provides an accessible workflow to combine in situ imaging and protein staining with next generation sequencing of the same cells, leaving the sample intact for further analysis post-sequencing.
2022-11-15GBZ file format for pangenome graphsSirén J, Paten BHIVE TC-CMUMotivation: Pangenome graphs representing aligned genome assemblies are being shared in the text-based Graphical Fragment Assembly format. As the number of assemblies grows, there is a need for a file format that can store the highly repetitive data space efficiently. Results: We propose the GBZ file format based on data structures used in the Giraffe short-read aligner. The format provides good compression, and the files can be efficiently loaded into in-memory data structures. We provide compression and decompression tools and libraries for using GBZ graphs, and we show that they can be efficiently used on a variety of systems. Availability and implementation: C++ and Rust implementations are available at https://github.com/jltsiren/gbwtgraph and https://github.com/jltsiren/gbwt-rs, respectively.
2022-11-15Gestationally dependent immune organization at the maternal-fetal interfaceMoore AR, Vivanco Gonzalez N, Plummer KA, Mitchel OR, Kaur H, Rivera M, Collica B, Goldston M, Filiz F, Angelo M, Palmer TD, Bendall SCRTI-StanfordThe immune system and placenta have a dynamic relationship across gestation to accommodate fetal growth and development. High-resolution characterization of this maternal-fetal interface is necessary to better understand the immunology of pregnancy and its complications. We developed a single-cell framework to simultaneously immuno-phenotype circulating, endovascular, and tissue-resident cells at the maternal-fetal interface throughout gestation, discriminating maternal and fetal contributions. Our data reveal distinct immune profiles across the endovascular and tissue compartments with tractable dynamics throughout gestation that respond to a systemic immune challenge in a gestationally dependent manner. We uncover a significant role for the innate immune system where phagocytes and neutrophils drive temporal organization of the placenta through remarkably diverse populations, including PD-L1+ subsets having compartmental and early gestational bias. Our approach and accompanying datasets provide a resource for additional investigations into gestational immunology and evoke a more significant role for the innate immune system in establishing the microenvironment of early pregnancy.
2022-11-16An optimized approach and inflation media for obtaining complimentary mass spectrometry-based omics data from human lung tissueLukowski JK, Olson H, Velickovic M, Wang J, Kyle JE, Kim YM, Williams SM, Zhu Y, Huyck HL, McGraw MD, Poole C, Rogers L, Misra R, Alexandrov T, Ansong C, Pryhuber GS, Clair G, Adkins JN, Carson JP, Anderton CRTMC-URMCHuman disease states are biomolecularly multifaceted and can span across phenotypic states, therefore it is important to understand diseases on all levels, across cell types, and within and across microanatomical tissue compartments. To obtain an accurate and representative view of the molecular landscape within human lungs, this fragile tissue must be inflated and embedded to maintain spatial fidelity of the location of molecules and minimize molecular degradation for molecular imaging experiments. Here, we evaluated agarose inflation and carboxymethyl cellulose embedding media and determined effective tissue preparation protocols for performing bulk and spatial mass spectrometry-based omics measurements. Mass spectrometry imaging methods were optimized to boost the number of annotatable molecules in agarose inflated lung samples. This optimized protocol permitted the observation of unique lipid distributions within several airway regions in the lung tissue block. Laser capture microdissection of these airway regions followed by high-resolution proteomic analysis allowed us to begin linking the lipidome with the proteome in a spatially resolved manner, where we observed proteins with high abundance specifically localized to the airway regions. We also compared our mass spectrometry results to lung tissue samples preserved using two other inflation/embedding media, but we identified several pitfalls with the sample preparation steps using this preservation method. 
Overall, we demonstrated the versatility of the inflation method, and we can start to reveal how the metabolome, lipidome, and proteome are connected spatially in human lungs and across disease states through a variety of different experiments.
2022-11-30Annotation of Spatially Resolved Single-cell Data with STELLARBrbić M, Cao K, Hickey JW, Tan Y, Snyder MP, Nolan GP, Leskovec JTMC-StanfordAccurate cell-type annotation from spatially resolved single cells is crucial to understand functional spatial biology that is the basis of tissue organization. However, current computational methods for annotating spatially resolved single-cell data are typically based on techniques established for dissociated single-cell technologies and thus do not take spatial organization into account. Here we present STELLAR, a geometric deep learning method for cell-type discovery and identification in spatially resolved single-cell datasets. STELLAR automatically assigns cells to cell types present in the annotated reference dataset and discovers novel cell types and cell states. STELLAR transfers annotations across different dissection regions, different tissues and different donors, and learns cell representations that capture higher-order tissue structures. We successfully applied STELLAR to CODEX multiplexed fluorescent microscopy data and multiplexed RNA imaging datasets. Within the Human BioMolecular Atlas Program, STELLAR has annotated 2.6 million spatially resolved single cells with dramatic time savings.
2022-11-30UNIFAN: A Tool for Unsupervised Single-Cell Clustering and AnnotationLi D, Ding J, Bar-Joseph ZHIVE TC-CMUUNIFAN is an unsupervised cell type annotation tool for single-cell RNA sequencing data (scRNA-seq). Given single-cell expression data as input, UNIFAN outputs cell clusters as well as annotations for each cluster. The clustering process utilizes information on pathways and biological processes and these are also used to annotate the resulting clusters. In this software article, we focus on how to install UNIFAN and on the main steps involved in using UNIFAN for cell type annotations.
2022-12-06Understanding islet dysfunction in type 2 diabetes through multidimensional pancreatic phenotyping: The Human Pancreas Analysis ProgramShapira SN, Naji A, Atkinson MA, Powers AC, Kaestner KHTMC-Vanderbilt (Eye/pancreas)In this perspective, we provide an overview of a recently established National Institute of Diabetes and Digestive and Kidney Diseases (NIDDK) initiative, the Human Pancreas Analysis Program for Type 2 Diabetes (HPAP-T2D). This program is designed to define the molecular pathogenesis of islet dysfunction by studying human pancreatic tissue samples from organ donors with T2D. HPAP-T2D generates detailed datasets of physiological, histological, transcriptomic, epigenomic, and genomic information. Importantly, all data collected, generated, and analyzed by HPAP-T2D are made immediately and freely available through a centralized database, PANC-DB, thus providing a comprehensive data resource for the diabetes research community.
2022-12-08Machine-learning based investigation of prognostic indicators for oncological outcome of pancreatic ductal adenocarcinomaChang J, Liu Y, Saey SA, Chang KC, Shrader HR, Steckly KL, Rajput M, Sonka M, Chan CHFHIVE IEC-PSCIntroduction: Pancreatic ductal adenocarcinoma (PDAC) is an aggressive malignancy with a poor prognosis. Surgical resection remains the only potential curative treatment option for early-stage resectable PDAC. Patients with locally advanced or micrometastatic disease should ideally undergo neoadjuvant therapy prior to surgical resection for an optimal treatment outcome. Computerized tomography (CT) scan is the most common imaging modality obtained prior to surgery. However, the ability of CT scans to assess the nodal status and resectability remains suboptimal and depends heavily on physician experience. Improved preoperative radiographic tumor staging with the prediction of postoperative margin and the lymph node status could have important implications in treatment sequencing. This paper proposes a novel machine learning predictive model, utilizing a three-dimensional convoluted neural network (3D-CNN), to reliably predict the presence of lymph node metastasis and the postoperative positive margin status based on preoperative CT scans. Methods: A total of 881 CT scans were obtained from 110 patients with PDAC. Patients and images were separated into training and validation groups for both lymph node and margin prediction studies. Per-scan analysis and per-patient analysis (utilizing majority voting method) were performed. Results: For a lymph node prediction 3D-CNN model, accuracy was 90% for per-patient analysis and 75% for per-scan analysis. For a postoperative margin prediction 3D-CNN model, accuracy was 81% for per-patient analysis and 76% for per-scan analysis. 
Discussion: This paper provides a proof of concept that utilizing radiomics and the 3D-CNN deep learning framework may be used preoperatively to improve the prediction of positive resection margins as well as the presence of lymph node metastatic disease. Further investigations should be performed with larger cohorts to increase the generalizability of this model; however, there is a great promise in the use of convoluted neural networks to assist clinicians with treatment selection for patients with PDAC.
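The per-patient analysis described in the Methods above aggregates per-scan predictions by majority voting. A minimal sketch of that aggregation step follows; the patient identifiers, the binary label encoding, and the tie-breaking rule (ties go to the positive label) are assumptions made for illustration and are not taken from the paper.

```python
from collections import Counter, defaultdict
from typing import Dict, Iterable, Tuple


def per_patient_majority(scan_predictions: Iterable[Tuple[str, int]]) -> Dict[str, int]:
    """Collapse per-scan labels (0 or 1) into one label per patient by majority vote.

    Assumption: a tie is resolved toward the larger label (1, i.e. positive).
    """
    votes: Dict[str, Counter] = defaultdict(Counter)
    for patient_id, label in scan_predictions:
        votes[patient_id][label] += 1

    result: Dict[str, int] = {}
    for patient_id, counts in votes.items():
        top = max(counts.values())
        result[patient_id] = max(label for label, c in counts.items() if c == top)
    return result


# Hypothetical per-scan outputs: (patient id, predicted label).
scans = [
    ("p1", 1), ("p1", 1), ("p1", 0),
    ("p2", 0), ("p2", 0), ("p2", 1),
    ("p3", 1), ("p3", 0),
]
patient_labels = per_patient_majority(scans)
```

Because several scans vote for each patient, isolated per-scan errors can be outvoted, which is one plausible reason per-patient accuracy can exceed per-scan accuracy.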
2022-12-09Emerging Computational Methods in Mass Spectrometry ImagingHu H, Laskin JTTD-PurdueMass spectrometry imaging (MSI) is a powerful analytical technique that generates maps of hundreds of molecules in biological samples with high sensitivity and molecular specificity. Advanced MSI platforms with capability of high-spatial resolution and high-throughput acquisition generate vast amount of data, which necessitates the development of computational tools for MSI data analysis. In addition, computation-driven MSI experiments have recently emerged as enabling technologies for further improving the MSI capabilities with little or no hardware modification. This review provides a critical summary of computational methods and resources developed for MSI data analysis and interpretation along with computational approaches for improving throughput and molecular coverage in MSI experiments. This review is focused on the recently developed artificial intelligence methods and provides an outlook for a future paradigm shift in MSI with transformative computational methods.
2022-12-13Tissue registration and exploration user interfaces in support of a human reference atlasBörner K, Bueckle A, Herr BW 2nd, Cross LE, Quardokus EM, Record EG, Ju Y, Silverstein JC, Browne KM, Jain S, Wasserfall CH, Jorgensen ML, Spraggins JM, Patterson NH, Weber GMConsortiumSeventeen international consortia are collaborating on a human reference atlas (HRA), a comprehensive, high-resolution, three-dimensional atlas of all the cells in the healthy human body. Laboratories around the world are collecting tissue specimens from donors varying in sex, age, ethnicity, and body mass index. However, harmonizing tissue data across 25 organs and more than 15 bulk and spatial single-cell assay types poses challenges. Here, we present software tools and user interfaces developed to spatially and semantically annotate ("register") and explore the tissue data and the evolving HRA. A key part of these tools is a common coordinate framework, providing standard terminologies and data structures for describing specimen, biological structure, and spatial data linked to existing ontologies. As of April 22, 2022, the "registration" user interface has been used to harmonize and publish data on 5,909 tissue blocks collected by the Human Biomolecular Atlas Program (HuBMAP), the Stimulating Peripheral Activity to Relieve Conditions program (SPARC), the Human Cell Atlas (HCA), the Kidney Precision Medicine Project (KPMP), and the Genotype Tissue Expression project (GTEx). Further, 5,856 tissue sections were derived from 506 HuBMAP tissue blocks. The second "exploration" user interface enables consortia to evaluate data quality, explore tissue data spatially within the context of the HRA, and guide data acquisition. A companion website is at https://cns-iu.github.io/HRA-supporting-information/ .
2023-01-06MatrisomeDB 2.0: 2023 updates to the ECM-protein knowledge databaseShao X, Gomez CD, Kapoor N, Considine JM, Grams C, Gao YT, Naba ADP-IllinoisThe extracellular matrix (ECM) is a complex assembly of proteins that constitutes the scaffold organizing cells, tissues, and organs. Over the past decade, mass-spectrometry-based proteomics has become the method of choice to profile the composition of the ECM, or the matrisome, of tissues. To assist non-specialists with the reuse of ECM proteomic datasets, we released MatrisomeDB (https://matrisomedb.org) in 2020. Here, we report the expansion of the database to include 25 new curated studies on the ECM of 24 new tissues in addition to datasets on tissues previously included, more than doubling the size of the original database and achieving near-complete coverage of the in-silico predicted matrisome. We further enhanced data visualization by maps of peptides and post-translational-modifications detected onto domain-based representations and 3D structures of ECM proteins. We also referenced external resources to facilitate the design of targeted mass spectrometry assays. Last, we implemented an abstract-mining tool that generates an enrichment word cloud from abstracts of studies in which a queried protein is found with higher confidence and higher abundance relative to other studies in MatrisomeDB.
2023-01-14Creating a common language for the subanatomy of the ovaryTsui EL, O'Neill KE, LeDuc RD, Shikanov A, Gomez-Lobo V, Laronda MMTMC-UPennNA
2023-01-16Haplotype-aware pantranscriptome analyses using spliced pangenome graphsSibbesen JA, Eizenga JM, Novak AM, Sirén J, Chang X, Garrison E, Paten BHIVE TC-CMUPangenomics is emerging as a powerful computational paradigm in bioinformatics. This field uses population-level genome reference structures, typically consisting of a sequence graph, to mitigate reference bias and facilitate analyses that were challenging with previous reference-based methods. In this work, we extend these methods into transcriptomics to analyze sequencing data using the pantranscriptome: a population-level transcriptomic reference. Our toolchain, which consists of additions to the VG toolkit and a standalone tool, RPVG, can construct spliced pangenome graphs, map RNA sequencing data to these graphs, and perform haplotype-aware expression quantification of transcripts in a pantranscriptome. We show that this workflow improves accuracy over state-of-the-art RNA sequencing mapping methods, and that it can efficiently quantify haplotype-specific transcript expression without needing to characterize the haplotypes of a sample beforehand.
2023-01-18A streamlined tandem tip-based workflow for sensitive nanoscale phosphoproteomicsTsai CF, Wang YT, Hsu CC, Kitata RB, Chu RK, Velickovic M, Zhao R, Williams SM, Chrisler WB, Jorgensen ML, Moore RJ, Zhu Y, Rodland KD, Smith RD, Wasserfall CH, Shi T, Liu TTTD-PNNL/NorthwesternEffective phosphoproteome of nanoscale sample analysis remains a daunting task, primarily due to significant sample loss associated with non-specific surface adsorption during enrichment of low stoichiometric phosphopeptide. We develop a tandem tip phosphoproteomics sample preparation method that is capable of sample cleanup and enrichment without additional sample transfer, and its integration with our recently developed SOP (Surfactant-assisted One-Pot sample preparation) and iBASIL (improved Boosting to Amplify Signal with Isobaric Labeling) approaches provides a streamlined workflow enabling sensitive, high-throughput nanoscale phosphoproteome measurements. This approach significantly reduces both sample loss and processing time, allowing the identification of >3000 (>9500) phosphopeptides from 1 (10) µg of cell lysate using the label-free method without a spectral library. It also enables precise quantification of ~600 phosphopeptides from 100 sorted cells (single-cell level input for the enriched phosphopeptides) and ~700 phosphopeptides from human spleen tissue voxels with a spatial resolution of 200 µm (equivalent to ~100 cells) in a high-throughput manner. The new workflow opens avenues for phosphoproteome profiling of mass-limited samples at the low nanogram level.
2023-01-31Polyphony: an Interactive Transfer Learning Framework for Single-Cell Data AnalysisCheng F, Keller MS, Qu H, Gehlenborg N, Wang QHIVE TC-HarvardReference-based cell-type annotation can significantly reduce time and effort in single-cell analysis by transferring labels from a previously-annotated dataset to a new dataset. However, label transfer by end-to-end computational methods is challenging due to the entanglement of technical (e.g., from different sequencing batches or techniques) and biological (e.g., from different cellular microenvironments) variations, only the first of which must be removed. To address this issue, we propose Polyphony, an interactive transfer learning (ITL) framework, to complement biologists' knowledge with advanced computational methods. Polyphony is motivated and guided by domain experts' needs for a controllable, interactive, and algorithm-assisted annotation process, identified through interviews with seven biologists. We introduce anchors, i.e., analogous cell populations across datasets, as a paradigm to explain the computational process and collect user feedback for model improvement. We further design a set of visualizations and interactions to empower users to add, delete, or modify anchors, resulting in refined cell type annotations. The effectiveness of this approach is demonstrated through quantitative experiments, two hypothetical use cases, and interviews with two biologists. The results show that our anchor-based ITL method takes advantage of both human and machine intelligence in annotating massive single-cell datasets.
2023-01-31A Multicompartmental Diffusion Model for Improved Assessment of Whole-Body Diffusion-weighted Imaging Data and Evaluation of Prostate Cancer Bone MetastasesConlin CC, Feng CH, Digma LA, Rodríguez-Soto AE, Kuperman JM, Rakow-Penner R, Karow DS, White NS, Seibert TM, Hahn ME, Dale AMTMC-UCSD (Female reproduction)Purpose: To develop a multicompartmental signal model for whole-body diffusion-weighted imaging (DWI) and apply it to study the diffusion properties of normal tissue and metastatic prostate cancer bone lesions in vivo. Materials and Methods: This prospective study (ClinicalTrials.gov NCT03440554) included 139 men with prostate cancer (mean age, 70 years ± 9 [SD]). Multicompartmental models with two to four tissue compartments were fit to DWI data from whole-body scans to determine optimal compartmental diffusion coefficients. Bayesian information criterion (BIC) and model-fitting residuals were calculated to quantify model complexity and goodness of fit. Diffusion coefficients for the optimal model (having lowest BIC) were used to compute compartmental signal-contribution maps. The signal intensity ratio (SIR) of bone lesions to normal-appearing bone was measured on these signal-contribution maps and on conventional DWI scans and compared using paired t tests (α = .05). Two-sample t tests (α = .05) were used to compare compartmental signal fractions between lesions and normal-appearing bone. Results: Lowest BIC was observed from the four-compartment model, with optimal compartmental diffusion coefficients of 0, 1.1 × 10⁻³, 2.8 × 10⁻³, and >3.0 × 10⁻² mm²/sec. Fitting residuals from this model were significantly lower than from conventional apparent diffusion coefficient mapping (P < .001). Bone lesion SIR was significantly higher on signal-contribution maps of model compartments 1 and 2 than on conventional DWI scans (P < .008).
The fraction of signal from compartments 2, 3, and 4 was also significantly different between metastatic bone lesions and normal-appearing bone tissue (P ≤ .02). Conclusion: The four-compartment model best described whole-body diffusion properties. Compartmental signal contributions from this model can be used to examine prostate cancer bone involvement.
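The abstract above selects among candidate models by lowest BIC. As a minimal sketch of how such a comparison works, the snippet below uses the standard BIC form for a least-squares fit with Gaussian errors, BIC = n·ln(RSS/n) + k·ln(n). Whether the study used exactly this form is an assumption, and the residual sums of squares and parameter counts below are invented purely for illustration.

```python
import math
from typing import Dict, Tuple


def bic_least_squares(rss: float, n_obs: int, n_params: int) -> float:
    """BIC for a least-squares fit under Gaussian errors (up to an additive constant).

    The first term rewards goodness of fit; the second penalizes model complexity.
    """
    return n_obs * math.log(rss / n_obs) + n_params * math.log(n_obs)


def select_model(fits: Dict[str, Tuple[float, int, int]]) -> str:
    """Return the name of the model with the lowest BIC.

    fits maps a model name to (residual sum of squares, observations, parameters).
    """
    return min(fits, key=lambda name: bic_least_squares(*fits[name]))


# Invented numbers: the complex model fits slightly better but uses more parameters.
candidate_fits = {
    "simple": (20.0, 100, 2),
    "complex": (19.9, 100, 6),
}
best = select_model(candidate_fits)
```

In this toy case the small improvement in residuals does not justify four extra parameters, so the simpler model has the lower BIC.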
2023-02-01193 nm Ultraviolet Photodissociation for the Characterization of Singly Charged Proteoforms Generated by MALDIZemaitis KJ, Zhou M, Kew W, Paša-Tolić LTTD-PNNLMALDI imaging allows for the near-cellular profiling of proteoforms directly from microbial, plant, and mammalian samples. Despite detecting hundreds of proteoforms, identification of unknowns with only intact mass information remains a distinct challenge, even with high mass resolving power and mass accuracy. To this end, many supplementary methods have been used to create experimental databases for accurate mass matching, including bulk or spatially resolved bottom-up and/or top-down proteomics. Herein, we describe the application of 193 nm ultraviolet photodissociation (UVPD) for fragmentation of quadrupole isolated singly charged ubiquitin (m/z 8565) by MALDI-UVPD on a UHMR HF Orbitrap. This platform permitted the high-resolution accurate mass measurement of not just terminal fragments but also large internal fragments. The outlined workflow demonstrates the feasibility of top-down analyses of isolated MALDI protein ions and the potential toward more comprehensive characterization of proteoforms in MALDI imaging applications.
2023-02-01Long noncoding RNA LEENE promotes angiogenesis and ischemic recovery in diabetes modelsTang X, Luo Y, Yuan D, Calandrelli R, Malhi NK, Sriram K, Miao Y, Lou CH, Tsark W, Tapia A, Chen AT, Zhang G, Roeth D, Kalkum M, Wang ZV, Chien S, Natarajan R, Cooke JP, Zhong S, Chen ZBTTD-UCSD/City of HopeImpaired angiogenesis in diabetes is a key process contributing to ischemic diseases such as peripheral arterial disease. Epigenetic mechanisms, including those mediated by long noncoding RNAs (lncRNAs), are crucial links connecting diabetes and the related chronic tissue ischemia. Here we identify the lncRNA that enhances endothelial nitric oxide synthase (eNOS) expression (LEENE) as a regulator of angiogenesis and ischemic response. LEENE expression was decreased in diabetic conditions in cultured endothelial cells (ECs), mouse hind limb muscles, and human arteries. Inhibition of LEENE in human microvascular ECs reduced their angiogenic capacity with a dysregulated angiogenic gene program. Diabetic mice deficient in Leene demonstrated impaired angiogenesis and perfusion following hind limb ischemia. Importantly, overexpression of human LEENE rescued the impaired ischemic response in Leene-knockout mice at tissue functional and single-cell transcriptomic levels. Mechanistically, LEENE RNA promoted transcription of proangiogenic genes in ECs, such as KDR (encoding VEGFR2) and NOS3 (encoding eNOS), potentially by interacting with LEO1, a key component of the RNA polymerase II-associated factor complex and MYC, a crucial transcription factor for angiogenesis. Taken together, our findings demonstrate an essential role for LEENE in the regulation of angiogenesis and tissue perfusion. Functional enhancement of LEENE to restore angiogenesis for tissue repair and regeneration may represent a potential strategy to tackle ischemic vascular diseases.
2023-02-03Optimal gap-affine alignment in O(s) spaceMarco-Sola S, Eizenga JM, Guarracino A, Paten B, Garrison E, Moreto M.HIVE TC-CMUMotivation: Pairwise sequence alignment remains a fundamental problem in computational biology and bioinformatics. Recent advances in genomics and sequencing technologies demand faster and scalable algorithms that can cope with the ever-increasing sequence lengths. Classical pairwise alignment algorithms based on dynamic programming are strongly limited by quadratic requirements in time and memory. The recently proposed wavefront alignment algorithm (WFA) introduced an efficient algorithm to perform exact gap-affine alignment in O(ns) time, where s is the optimal score and n is the sequence length. Notwithstanding these bounds, WFA's O(s²) memory requirements become computationally impractical for genome-scale alignments, leading to a need for further improvement. Results: In this article, we present the bidirectional WFA algorithm, the first gap-affine algorithm capable of computing optimal alignments in O(s) memory while retaining WFA's time complexity of O(ns). As a result, this work improves the lowest known memory bound O(n) to compute gap-affine alignments. In practice, our implementation never requires more than a few hundred MBs aligning noisy Oxford Nanopore Technologies reads up to 1 Mbp long while maintaining competitive execution times. Availability and implementation: All code is publicly available at https://github.com/smarco/BiWFA-paper. Supplementary information: Supplementary data are available at Bioinformatics online.
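For orientation, the classical dynamic-programming baseline that the abstract above contrasts with WFA can be sketched as follows. This is a Gotoh-style score computation with quadratic time, not WFA or BiWFA. It assumes a penalty model in which a match costs 0, a mismatch costs `mismatch`, and a gap of length L costs `gap_open + L * gap_extend`; the default penalty values are illustrative.

```python
import math


def gap_affine_score(a: str, b: str, mismatch: int = 4,
                     gap_open: int = 6, gap_extend: int = 2) -> float:
    """Minimum gap-affine alignment penalty between a and b (global alignment).

    Runs in O(len(a) * len(b)) time and keeps only two rows in memory.
    M holds the best cost of any alignment of the prefixes, I the best cost
    ending in a gap that consumes b, D the best cost ending in a gap that
    consumes a.
    """
    inf = math.inf
    m = len(b)
    # Row 0: the empty prefix of a can only be aligned to b[:j] by one long gap.
    M_prev = [0] + [gap_open + j * gap_extend for j in range(1, m + 1)]
    D_prev = [inf] * (m + 1)
    for i in range(1, len(a) + 1):
        # Column 0: a[:i] aligned to the empty prefix of b by one long gap.
        M_cur = [gap_open + i * gap_extend] + [inf] * m
        D_cur = [gap_open + i * gap_extend] + [inf] * m
        I_cur = [inf] * (m + 1)
        for j in range(1, m + 1):
            # Open a new gap from M, or extend an existing one.
            I_cur[j] = min(M_cur[j - 1] + gap_open + gap_extend,
                           I_cur[j - 1] + gap_extend)
            D_cur[j] = min(M_prev[j] + gap_open + gap_extend,
                           D_prev[j] + gap_extend)
            substitution = M_prev[j - 1] + (0 if a[i - 1] == b[j - 1] else mismatch)
            M_cur[j] = min(substitution, I_cur[j], D_cur[j])
        M_prev, D_prev = M_cur, D_cur
    return M_prev[m]


identical = gap_affine_score("ACGT", "ACGT")
one_mismatch = gap_affine_score("ACGT", "AGGT")
one_deletion = gap_affine_score("ACGT", "ACT")
```

The work of this baseline grows with the product of the sequence lengths regardless of how similar the sequences are, whereas the abstract states that WFA's time grows with the optimal score s, which is small for similar sequences.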
2023-02-24Deep Learning Approach for Dynamic Sampling for Multichannel Mass Spectrometry ImagingHelminiak D, Hu H, Laskin J, Hye Ye DTTD-PurdueMass Spectrometry Imaging (MSI), using traditional rectilinear scanning, takes hours to days for high spatial resolution acquisitions. Given that most pixels within a sample's field of view are often neither relevant to underlying biological structures nor chemically informative, MSI presents as a prime candidate for integration with sparse and dynamic sampling algorithms. During a scan, stochastic models determine which locations probabilistically contain information critical to the generation of low-error reconstructions. Decreasing the number of required physical measurements thereby minimizes overall acquisition times. A Deep Learning Approach for Dynamic Sampling (DLADS), utilizing a Convolutional Neural Network (CNN) and encapsulating molecular mass intensity distributions within a third dimension, demonstrates a simulated 70% throughput improvement for Nanospray Desorption Electrospray Ionization (nano-DESI) MSI tissues. Evaluations are conducted between DLADS, a Supervised Learning Approach for Dynamic Sampling, with Least-Squares regression (SLADS-LS), and a Multi-Layer Perceptron (MLP) network (SLADS-Net). When compared with SLADS-LS, limited to a single m/z channel, as well as multichannel SLADS-LS and SLADS-Net, DLADS respectively improves regression performance by 36.7%, 7.0%, and 6.2%, resulting in gains to reconstruction quality of 6.0%, 2.1%, and 3.4% for acquisition of targeted m/z.
2023-02-28Spatially Resolved Top-Down Proteomics of Tissue Sections Based on a Microfluidic Nanodroplet Sample Preparation PlatformLiao YC, Fulcher JM, Degnan DJ, Williams SM, Bramer LM, Veličković D, Zemaitis KJ, Veličković M, Sontag RL, Moore RJ, Paša-Tolić L, Zhu Y, Zhou MTTD-PNNLConventional proteomic approaches measure the averaged signal from mixed cell populations or bulk tissues, leading to the dilution of signals arising from subpopulations of cells that might serve as important biomarkers. Recent developments in bottom-up proteomics have enabled spatial mapping of cellular heterogeneity in tissue microenvironments. However, bottom-up proteomics cannot unambiguously define and quantify proteoforms, which are intact (i.e., functional) forms of proteins capturing genetic variations, alternatively spliced transcripts and posttranslational modifications. Herein, we described a spatially resolved top-down proteomics (TDP) platform for proteoform identification and quantitation directly from tissue sections. The spatial TDP platform consisted of a nanodroplet processing in one pot for trace samples-based sample preparation system and a laser capture microdissection-based cell isolation system. We improved the nanodroplet processing in one pot for trace samples sample preparation by adding benzonase in the extraction buffer to enhance the coverage of nucleus proteins. Using ∼200 cultured cells as test samples, this approach increased total proteoform identifications from 493 to 700, with newly identified proteoforms primarily corresponding to nuclear proteins. To demonstrate the spatial TDP platform in tissue samples, we analyzed laser capture microdissection-isolated tissue voxels from rat brain cortex and hypothalamus regions. We quantified 509 proteoforms within the union of top-down mass spectrometry-based proteoform identification and characterization and TDPortal identifications to match with features from protein mass extractor. 
Several proteoforms corresponding to the same gene exhibited mixed abundance profiles between two tissue regions, suggesting potential posttranslational modification-specific spatial distributions. The spatial TDP workflow has prospects for biomarker discovery at proteoform level from small tissue sections.
2023-02-28Unsupervised Registration Refinement for Generating Unbiased Eye AtlasLee HH, Tang Y, Bao S, Yang Q, Xu X, Schey KL, Spraggins JM, Huo Y, Landman BATMC-Vanderbilt (Kidney)With the confounding effects of demographics across large-scale imaging surveys, substantial variation is demonstrated with the volumetric structure of orbit and eye anthropometry. Such variability increases the level of difficulty to localize the anatomical features of the eye organs for populational analysis. To adapt the variability of eye organs with stable registration transfer, we propose an unbiased eye atlas template followed by a hierarchical coarse-to-fine approach to provide generalized eye organ context across populations. Furthermore, we retrieved volumetric scans from 1842 healthy patients for generating an eye atlas template with minimal biases. Briefly, we select 20 subject scans and use an iterative approach to generate an initial unbiased template. We then perform metric-based registration to the remaining samples with the unbiased template and generate coarse registered outputs. The coarse registered outputs are further leveraged to train a deep probabilistic network, which aims to refine the organ deformation in unsupervised setting. Computed tomography (CT) scans of 100 de-identified subjects are used to generate and evaluate the unbiased atlas template with the hierarchical pipeline. The refined registration shows the stable transfer of the eye organs, which were well-localized in the high-resolution (0.5 mm3) atlas space and demonstrated a significant improvement of 2.37% Dice for inverse label transfer performance. The subject-wise qualitative representations with surface rendering successfully demonstrate the transfer details of the organ context and showed the applicability of generalizing the morphological variation across patients.
2023-03-01Mass spectrometry-based targeted proteomics for analysis of protein mutationsLin TT, Zhang T, Kitata RB, Liu T, Smith RD, Qian WJ, Shi TTTD-PNNL/NorthwesternCancers are caused by accumulated DNA mutations. This recognition of the central role of mutations in cancer and recent advances in next-generation sequencing, has initiated the massive screening of clinical samples and the identification of 1000s of cancer-associated gene mutations. However, proteomic analysis of the expressed mutation products lags far behind genomic (transcriptomic) analysis. With comprehensive global proteomics analysis, only a small percentage of single nucleotide variants detected by DNA and RNA sequencing have been observed as single amino acid variants due to current technical limitations. Proteomic analysis of mutations is important with the potential to advance cancer biomarker development and the discovery of new therapeutic targets for more effective disease treatment. Targeted proteomics using selected reaction monitoring (also known as multiple reaction monitoring) and parallel reaction monitoring, has emerged as a powerful tool with significant advantages over global proteomics for analysis of protein mutations in terms of detection sensitivity, quantitation accuracy and overall practicality (e.g., reliable identification and the scale of quantification). Herein we review recent advances in the targeted proteomics technology for enhancing detection sensitivity and multiplexing capability and highlight its broad biomedical applications for analysis of protein mutations in human bodily fluids, tissues, and cell lines. Furthermore, we review recent applications of top-down proteomics for analysis of protein mutations. Unlike the commonly used bottom-up proteomics which requires digestion of proteins into peptides, top-down proteomics directly analyzes intact proteins for more precise characterization of mutation isoforms. 
Finally, general perspectives on the potential of achieving both high sensitivity and high sample throughput for large-scale targeted detection and quantification of important protein mutations are discussed.
2023-03-01Human pancreatic capillaries and nerve fibers persist in type 1 diabetes despite beta cell lossRichardson TM, Saunders DC, Haliyur R, Shrestha S, Cartailler JP, Reinert RB, Petronglo J, Bottino R, Aramandla R, Bradley AM, Jenkins R, Phillips S, Kang H; Human Pancreas Analysis Program; Caicedo A, Powers AC, Brissova MTMC-Vanderbilt (Eye/pancreas)The autonomic nervous system regulates pancreatic function. Islet capillaries are essential for the extension of axonal projections into islets, and both of these structures are important for appropriate islet hormone secretion. Because beta cells provide important paracrine cues for islet glucagon secretion and neurovascular development, we postulated that beta cell loss in type 1 diabetes (T1D) would lead to a decline in intra-islet capillaries and reduction of islet innervation, possibly contributing to abnormal glucagon secretion. To define morphological characteristics of capillaries and nerve fibers in islets and acinar tissue compartments, we analyzed neurovascular assembly across the largest cohort of T1D and normal individuals studied thus far. Because innervation has been studied extensively in rodent models of T1D, we also compared the neurovascular architecture between mouse and human pancreas and assembled transcriptomic profiles of molecules guiding islet angiogenesis and neuronal development. We found striking inter-species differences in islet neurovascular assembly but relatively modest differences at transcriptome level, suggesting post-transcriptional regulation may be involved in this process. To determine if islet neurovascular arrangement is altered following beta cell loss in T1D, we compared pancreatic tissues from non-diabetic, recent-onset T1D (<10 years duration), and longstanding T1D donors (>10 years duration). 
Both islets and acinar tissue had greater capillary density in recent-onset T1D accompanied by overall greater islet nerve fiber density in recent-onset and longstanding T1D as visualized by a pan-neuronal marker. We did not detect changes in sympathetic axons in either T1D cohort. Additionally, nerve fibers overlapped with extracellular matrix (ECM), supporting its role in the formation and function of axonal processes. These results indicate that pancreatic capillaries and nerve fibers persist in T1D despite beta cell loss, suggesting that alpha cell secretory changes may be decoupled from neurovascular components.
2023-03-01Anatomic nomenclature and 3-dimensional regional model of the human ovary: call for a new paradigmO'Neill KE, Maher JY, Laronda MM, Duncan FE, LeDuc RD, Lujan ME, Oktay KH, Pouch AM, Segars JH, Tsui EL, Zelinski MB, Halvorson LM, Gomez-Lobo VTMC-UPennThe ovaries are the female gonads that are crucial for reproduction, steroid production, and overall health. Historically, the ovary was broadly divided into regions defined as the cortex, medulla, and hilum. This current nomenclature lacks specificity and fails to consider the significant anatomic variations in the ovary. Recent technological advances in imaging modalities and high-resolution omic analyses have brought about the need for revision of the existing definitions, which will facilitate the integration of generated data and enable the characterization of organ subanatomy and function at the cellular level. The creation of these high-resolution multimodal maps of the ovary will enhance collaboration and communication among disciplines and between clinicians and researchers. Beginning in March 2021, the Pediatric and Adolescent Gynecology Program of the Eunice Kennedy Shriver National Institute of Child Health and Human Development invited subject-matter experts to participate in a series of workshops and meetings to standardize ovarian nomenclature and define the organ's features. The goal was to develop a spatially defined and semantically consistent terminology of the ovary to support collaborative, team science-based endeavors aimed at generating reference atlases of the human ovary. The group recommended a standardized, 3-dimensional description of the ovary and an ontological approach to the subanatomy of the ovary and definition of follicles. 
This new greater precision in nomenclature and mapping will better reflect the ovary's heterogeneous composition and function, support the standardization of tissue collection, facilitate functional analyses, and enable clinical and research collaborations. The conceptualization process and outcomes of the effort, which spanned the better part of 2021 and early 2022, are introduced in this article. The institute and the workshop participants encourage researchers and clinicians to adopt the new systems in their everyday work to advance the overarching goal of improving human reproductive health.
2023-03-13Computational Pathology Fusing Spatial TechnologiesBorder S, Lucarelli N, Eadon MT, El-Achkar TM, Jain S, Sarder PHIVE FloridaNA
2023-03-27Specimen, biological structure, and spatial ontologies in support of a Human Reference AtlasHerr BW 2nd, Hardi J, Quardokus EM, Bueckle A, Chen L, Wang F, Caron AR, Osumi-Sutherland D, Musen MA, Börner KHIVE MC-IUThe Human Reference Atlas (HRA) is defined as a comprehensive, three-dimensional (3D) atlas of all the cells in the healthy human body. It is compiled by an international team of experts who develop standard terminologies that they link to 3D reference objects, describing anatomical structures. The third HRA release (v1.2) covers spatial reference data and ontology annotations for 26 organs. Experts access the HRA annotations via spreadsheets and view reference object models in 3D editing tools. This paper introduces the Common Coordinate Framework (CCF) Ontology v2.0.1 that interlinks specimen, biological structure, and spatial data, together with the CCF API that makes the HRA programmatically accessible and interoperable with Linked Open Data (LOD). We detail how real-world user needs and experimental data guide CCF Ontology design and implementation, present CCF Ontology classes and properties together with exemplary usage, and report on validation methods. The CCF Ontology graph database and API are used in the HuBMAP portal, HRA Organ Gallery, and other applications that support data queries across multiple, heterogeneous sources.
2023-03-28Nano-DESI Mass Spectrometry Imaging of Proteoforms in Biological Tissues with High Spatial ResolutionYang M, Unsihuay D, Hu H, Nguele Meke F, Qu Z, Zhang ZY, Laskin JTTD-PurdueMass spectrometry imaging (MSI) is a powerful tool for label-free mapping of the spatial distribution of proteins in biological tissues. We have previously demonstrated imaging of individual proteoforms in biological tissues using nanospray desorption electrospray ionization (nano-DESI), an ambient liquid extraction-based MSI technique. Nano-DESI MSI generates multiply charged protein ions, which is advantageous for their identification using top-down proteomics analysis. In this study, we demonstrate proteoform mapping in biological tissues with a spatial resolution down to 7 μm using nano-DESI MSI. A substantial decrease in protein signals observed in high-spatial-resolution MSI makes these experiments challenging. We have enhanced the sensitivity of nano-DESI MSI experiments by optimizing the design of the capillary-based probe and the thickness of the tissue section. In addition, we demonstrate that oversampling may be used to further improve spatial resolution at little or no expense to sensitivity. These developments represent a new step in MSI-based spatial proteomics, which complements targeted imaging modalities widely used for studying biological systems.
2023-03-31Super-resolution SRS microscopy with A-PoDJang H, Li Y, Fung AA, Bagheri P, Hoang K, Skowronska-Krawczyk D, Chen X, Wu JY, Bintu B, Shi LTMC-WUSTLStimulated Raman scattering (SRS) offers the ability to image metabolic dynamics with high signal-to-noise ratio. However, its spatial resolution is limited by the numerical aperture of the imaging objective and the scattering cross-section of molecules. To achieve super-resolved SRS imaging, we developed a deconvolution algorithm, adaptive moment estimation (Adam) optimization-based pointillism deconvolution (A-PoD) and demonstrated a spatial resolution of lower than 59 nm on the membrane of a single lipid droplet (LD). We applied A-PoD to spatially correlated multiphoton fluorescence imaging and deuterium oxide (D2O)-probed SRS (DO-SRS) imaging from diverse samples to compare nanoscopic distributions of proteins and lipids in cells and subcellular organelles. We successfully differentiated newly synthesized lipids in LDs using A-PoD-coupled DO-SRS. The A-PoD-enhanced DO-SRS imaging method was also applied to reveal metabolic changes in brain samples from Drosophila on different diets. This new approach allows us to quantitatively measure the nanoscopic colocalization of biomolecules and metabolic dynamics in organelles.
2023-03-31Senescent cell population with ZEB1 transcription factor as its main regulator promotes osteoarthritis in cartilage and meniscusSwahn H, Li K, Duffy T, Olmer M, D'Lima DD, Mondala TS, Natarajan P, Head SR, Lotz MKTMC-UConn/ScrippsObjectives: Single-cell level analysis of articular cartilage and meniscus tissues from human healthy and osteoarthritis (OA) knees. Methods: Single-cell RNA sequencing (scRNA-seq) analyses were performed on articular cartilage and meniscus tissues from healthy (n=6, n=7) and OA (n=6, n=6) knees. Expression of genes of interest was validated using immunohistochemistry and RNA-seq and function was analysed by gene overexpression and depletion. Results: scRNA-seq analyses of human knee articular cartilage (70 972 cells) and meniscus (78 017 cells) identified a pathogenic subset that is shared between both tissues. This cell population is expanded in OA and has strong OA and senescence gene signatures. Further, this subset has critical roles in extracellular matrix (ECM) and tenascin signalling and is the dominant sender of signals to all other cartilage and meniscus clusters and a receiver of TGFβ signalling. Fibroblast activating protein (FAP) is also a dysregulated gene in this cluster and promotes ECM degradation. Regulons that are controlled by transcription factor ZEB1 are shared between the pathogenic subset in articular cartilage and meniscus. In meniscus and cartilage cells, FAP and ZEB1 promote expression of genes that contribute to OA pathogenesis, including senescence. Conclusions: These single-cell studies identified a senescent pathogenic cell cluster that is present in cartilage and meniscus and has FAP and ZEB1 as main regulators which are novel and promising therapeutic targets for OA-associated pathways in both tissues.
2023-04-12Foundation models for generalist medical artificial intelligenceMoor M, Banerjee O, Abad ZSH, Krumholz HM, Leskovec J, Topol EJ, Rajpurkar P.TMC-StanfordThe exceptionally rapid development of highly flexible, reusable artificial intelligence (AI) models is likely to usher in newfound capabilities in medicine. We propose a new paradigm for medical AI, which we refer to as generalist medical AI (GMAI). GMAI models will be capable of carrying out a diverse set of tasks using very little or no task-specific labelled data. Built through self-supervision on large, diverse datasets, GMAI will flexibly interpret different combinations of medical modalities, including data from imaging, electronic health records, laboratory results, genomics, graphs or medical text. Models will in turn produce expressive outputs such as free-text explanations, spoken recommendations or image annotations that demonstrate advanced medical reasoning abilities. Here we identify a set of high-impact potential applications for GMAI and lay out specific technical capabilities and training datasets necessary to enable them. We expect that GMAI-enabled applications will challenge current strategies for regulating and validating AI devices for medicine and will shift practices associated with the collection of large medical datasets.
2023-04-13High-Throughput Mass Spectrometry Imaging of Biological Systems: Current Approaches and Future DirectionsJiang LX, Yang M, Wali SN, Laskin JTTD-PurdueIn the past two decades, the power of mass spectrometry imaging (MSI) for the label free spatial mapping of molecules in biological systems has been substantially enhanced by the development of approaches for imaging with high spatial resolution. With the increase in the spatial resolution, the experimental throughput has become a limiting factor for imaging of large samples with high spatial resolution and 3D imaging of tissues. Several experimental and computational approaches have been recently developed to enhance the throughput of MSI. In this critical review, we provide a succinct summary of the current approaches used to improve the throughput of MSI experiments. These approaches are focused on speeding up sampling, reducing the mass spectrometer acquisition time, and reducing the number of sampling locations. We discuss the rate-determining steps for different MSI methods and future directions in the development of high-throughput MSI techniques.
2023-04-19Uncovering the spatial landscape of molecular interactions within the tumor microenvironment through latent spacesDeshpande A, Loth M, Sidiropoulos DN, Zhang S, Yuan L, Bell ATF, Zhu Q, Ho WJ, Santa-Maria C, Gilkes DM, Williams SR, Uytingco CR, Chew J, Hartnett A, Bent ZW, Favorov AV, Popel AS, Yarchoan M, Kiemen A, Wu PH, Fujikura K, Wirtz D, Wood LD, Zheng L, Jaffee EM, Anders RA, Danilova L, Stein-O'Brien G, Kagohara LT, Fertig EJTMC-JHURecent advances in spatial transcriptomics (STs) enable gene expression measurements from a tissue sample while retaining its spatial context. This technology enables unprecedented in situ resolution of the regulatory pathways that underlie the heterogeneity in the tumor as well as the tumor microenvironment (TME). The direct characterization of cellular co-localization with spatial technologies facilitates quantification of the molecular changes resulting from direct cell-cell interaction, as it occurs in tumor-immune interactions. We present SpaceMarkers, a bioinformatics algorithm to infer molecular changes from cell-cell interactions from latent space analysis of ST data. We apply this approach to infer the molecular changes from tumor-immune interactions in Visium spatial transcriptomics data of metastasis, invasive and precursor lesions, and immunotherapy treatment. Further transfer learning in matched scRNA-seq data enabled further quantification of the specific cell types in which SpaceMarkers are enriched. Altogether, SpaceMarkers can identify the location and context-specific molecular interactions within the TME from ST data.
2023-04-24Using single cell atlas data to reconstruct regulatory networksSong Q, Ruffalo M, Bar-Joseph ZHIVE TC-CMUInference of global gene regulatory networks from omics data is a long-term goal of systems biology. Most methods developed for inferring transcription factor (TF)-gene interactions either relied on a small dataset or used snapshot data which is not suitable for inferring a process that is inherently temporal. Here, we developed a new computational method that combines neural networks and multi-task learning to predict RNA velocity rather than gene expression values. This allows our method to overcome many of the problems faced by prior methods, leading to a more accurate and more comprehensive set of identified regulatory interactions. Application of our method to atlas scale single cell data from 6 HuBMAP tissues led to several validated and novel predictions and greatly improved on prior methods proposed for this task.
2023-04-27The HRA Organ Gallery Affords Immersive Superpowers for Building and Exploring the Human Reference Atlas with Virtual RealityBueckle A, Qing C, Luley S, Kumar Y, Pandey N, Börner KHIVE MC-IUThe Human Reference Atlas (HRA, https://humanatlas.io) funded by the NIH Human Biomolecular Atlas Program (HuBMAP, https://commonfund.nih.gov/hubmap) and other projects engages 17 international consortia to create a spatial reference of the healthy adult human body at single-cell resolution. The specimen, biological structure, and spatial data that define the HRA are disparate in nature and benefit from a visually explicit method of data integration. Virtual reality (VR) offers unique means to enable users to explore complex data structures in a three-dimensional (3D) immersive environment. On a 2D desktop application, the 3D spatiality and real-world size of the 3D reference organs of the atlas is hard to understand. If viewed in VR, the spatiality of the organs and tissue blocks mapped to the HRA can be explored in their true size and in a way that goes beyond traditional 2D user interfaces. Added 2D and 3D visualizations can then provide data-rich context. In this paper, we present the HRA Organ Gallery, a VR application to explore the atlas in an integrated VR environment. Presently, the HRA Organ Gallery features 55 3D reference organs, 1,203 mapped tissue blocks from 292 demographically diverse donors and 15 providers that link to 6,000+ datasets; it also features prototype visualizations of cell type distributions and 3D protein structures. We outline our plans to support two biological use cases: on-ramping novice and expert users to HuBMAP data available via the Data Portal (https://portal.hubmapconsortium.org), and quality assurance/quality control (QA/QC) for HRA data providers. Code and onboarding materials are available at https://github.com/cns-iu/hra-organ-gallery-in-vr.
2023-04-28Ten Years of Extracellular Matrix Proteomics: Accomplishments, Challenges, and Future PerspectivesNaba ADP-IllinoisThe extracellular matrix (ECM) is a complex assembly of hundreds of proteins forming the architectural scaffold of multicellular organisms. In addition to its structural role, the ECM conveys signals orchestrating cellular phenotypes. Alterations of ECM composition, abundance, structure, or mechanics have been linked to diseases and disorders affecting all physiological systems, including fibrosis and cancer. Deciphering the protein composition of the ECM and how it changes in pathophysiological contexts is thus the first step toward understanding the roles of the ECM in health and disease and toward the development of therapeutic strategies to correct disease-causing ECM alterations. Potentially, the ECM also represents a vast, yet untapped reservoir of disease biomarkers. ECM proteins are characterized by unique biochemical properties that have hindered their study: they are large, heavily and uniquely posttranslationally modified, and highly insoluble. Overcoming these challenges, we and others have devised mass-spectrometry–based proteomic approaches to define the ECM composition, or “matrisome,” of tissues. The first part of this review provides a historical overview of ECM proteomics research and presents the latest advances that now allow the profiling of the ECM of healthy and diseased tissues. The second part highlights recent examples illustrating how ECM proteomics has emerged as a powerful discovery pipeline to identify prognostic cancer biomarkers. The third part discusses remaining challenges limiting our ability to translate findings to clinical application and proposes approaches to overcome them. Lastly, the review introduces readers to resources available to facilitate the interpretation of ECM proteomics datasets. The ECM was once thought to be impenetrable. 
Mass spectrometry–based proteomics has proven to be a powerful tool to decode the ECM. In light of the progress made over the past decade, there are reasons to believe that the in-depth exploration of the matrisome is within reach and that we may soon witness the first translational application of ECM proteomics.
2023-04-28Integrated Physiology of the Exocrine and Endocrine Compartments in Pancreatic Diseases: Workshop ProceedingsMastracci TL, Apte M, Amundadottir LT, Alvarsson A, Artandi S, Bellin MD, Bernal-Mizrachi E, Caicedo A, Campbell-Thompson M, Cruz-Monserrate Z, El Ouaamari A, Gaulton KJ, Geisz A, Goodarzi MO, Hara M, Hull-Meichle RL, Kleger A, Klein AP, Kopp JL, Kulkarni RN, Muzumdar MD, Naren AP, Oakes SA, Olesen SS, Phelps EA, Powers AC, Stabler CL, Tirkes T, Whitcomb DC, Yadav D, Yong J, Zaghloul NA, Pandol SJ, Sander MTMC-PNNLThe Integrated Physiology of the Exocrine and Endocrine Compartments in Pancreatic Diseases workshop was a 1.5-day scientific conference at the National Institutes of Health (Bethesda, MD) that engaged clinical and basic science investigators interested in diseases of the pancreas. This report provides a summary of the proceedings from the workshop. The goals of the workshop were to forge connections and identify gaps in knowledge that could guide future research directions. Presentations were segregated into six major theme areas, including 1) pancreas anatomy and physiology, 2) diabetes in the setting of exocrine disease, 3) metabolic influences on the exocrine pancreas, 4) genetic drivers of pancreatic diseases, 5) tools for integrated pancreatic analysis, and 6) implications of exocrine-endocrine cross talk. For each theme, multiple presentations were followed by panel discussions on specific topics relevant to each area of research; these are summarized here. Significantly, the discussions resulted in the identification of research gaps and opportunities for the field to address. In general, it was concluded that as a pancreas research community, we must more thoughtfully integrate our current knowledge of normal physiology as well as the disease mechanisms that underlie endocrine and exocrine disorders so that there is a better understanding of the interplay between these compartments.
2023-04-28Spatial epigenome-transcriptome co-profiling of mammalian tissuesZhang D, Deng Y, Kukanja P, Agirre E, Bartosovic M, Dong M, Ma C, Ma S, Su G, Bao S, Liu Y, Xiao Y, Rosoklija GB, Dwork AJ, Mann JJ, Leong KW, Boldrini M, Wang L, Haeussler M, Raphael BJ, Kluger Y, Castelo-Branco G, Fan RTTD-YaleEmerging spatial technologies, including spatial transcriptomics and spatial epigenomics, are becoming powerful tools for profiling of cellular states in the tissue context1-5. However, current methods capture only one layer of omics information at a time, precluding the possibility of examining the mechanistic relationship across the central dogma of molecular biology. Here, we present two technologies for spatially resolved, genome-wide, joint profiling of the epigenome and transcriptome by cosequencing chromatin accessibility and gene expression, or histone modifications (H3K27me3, H3K27ac or H3K4me3) and gene expression on the same tissue section at near-single-cell resolution. These were applied to embryonic and juvenile mouse brain, as well as adult human brain, to map how epigenetic mechanisms control transcriptional phenotype and cell dynamics in tissue. Although highly concordant tissue features were identified by either spatial epigenome or spatial transcriptome we also observed distinct patterns, suggesting their differential roles in defining cell states. Linking epigenome to transcriptome pixel by pixel allows the uncovering of new insights in spatial epigenetic priming, differentiation and gene regulation within the tissue architecture. These technologies are of great interest in life science and biomedical research.
2023-05-01Conundrums of Choice of 'Normal' Kidney Tissue for Single Cell StudiesJain STMC-WUSTLPurpose of review: Defining molecular changes in key kidney cell types across lifespan and in disease states is essential to understand the pathogenetic basis of disease progression and targeted therapies. Various single cell approaches are being applied to define disease associated molecular signatures. Key considerations include the choice of reference tissue or 'normal' for comparison to diseased human specimens and a benchmark reference atlas. We provide an overview of select single cell technologies, key considerations for experimental design, quality control, choices and challenges associated with assay type and source for reference tissue.
2023-05-05 | Rapid Multivariate Analysis Approach to Explore Differential Spatial Protein Profiles in Tissue | Sharman K, Patterson NH, Weiss A, Neumann EK, Guiberson ER, Ryan DJ, Gutierrez DB, Spraggins JM, Van de Plas R, Skaar EP, Caprioli RM | TMC-Vanderbilt (Kidney) | Spatially targeted proteomics analyzes the proteome of specific cell types and functional regions within tissue. While spatial context is often essential to understanding biological processes, interpreting sub-region-specific protein profiles can pose a challenge due to the high-dimensional nature of the data. Here, we develop a multivariate approach for rapid exploration of differential protein profiles acquired from distinct tissue regions and apply it to analyze a published spatially targeted proteomics data set collected from Staphylococcus aureus-infected murine kidney, 4 and 10 days postinfection. The data analysis process rapidly filters high-dimensional proteomic data to reveal relevant differentiating species among hundreds to thousands of measured molecules. We employ principal component analysis (PCA) for dimensionality reduction of protein profiles measured by microliquid extraction surface analysis mass spectrometry. Subsequently, k-means clustering of the PCA-processed data groups samples by chemical similarity. Cluster center interpretation revealed a subset of proteins that differentiate between spatial regions of infection over two time points. These proteins appear involved in tricarboxylic acid metabolomic pathways, calcium-dependent processes, and cytoskeletal organization. Gene ontology analysis further uncovered relationships to tissue damage/repair and calcium-related defense mechanisms. Applying our analysis in infectious disease highlighted differential proteomic changes across abscess regions over time, reflecting the dynamic nature of host-pathogen interactions.
2023-05-10 | A draft human pangenome reference | Liao WW, Asri M, Ebler J, Doerr D, Haukness M, Hickey G, Lu S, Lucas JK, Monlong J, Abel HJ, Buonaiuto S, Chang XH, Cheng H, Chu J, Colonna V, Eizenga JM, Feng X, Fischer C, Fulton RS, Garg S, Groza C, Guarracino A, Harvey WT, Heumos S, Howe K, Jain M, Lu TY, Markello C, Martin FJ, Mitchell MW, Munson KM, Mwaniki MN, Novak AM, Olsen HE, Pesout T, Porubsky D, Prins P, Sibbesen JA, Sirén J, Tomlinson C, Villani F, Vollger MR, Antonacci-Fulton LL, Baid G, Baker CA, Belyaeva A, Billis K, Carroll A, Chang PC, Cody S, Cook DE, Cook-Deegan RM, Cornejo OE, Diekhans M, Ebert P, Fairley S, Fedrigo O, Felsenfeld AL, Formenti G, Frankish A, Gao Y, Garrison NA, Giron CG, Green RE, Haggerty L, Hoekzema K, Hourlier T, Ji HP, Kenny EE, Koenig BA, Kolesnikov A, Korbel JO, Kordosky J, Koren S, Lee H, Lewis AP, Magalhães H, Marco-Sola S, Marijon P, McCartney A, McDaniel J, Mountcastle J, Nattestad M, Nurk S, Olson ND, Popejoy AB, Puiu D, Rautiainen M, Regier AA, Rhie A, Sacco S, Sanders AD, Schneider VA, Schultz BI, Shafin K, S | HIVE TC-CMU | Here the Human Pangenome Reference Consortium presents a first draft of the human pangenome reference. The pangenome contains 47 phased, diploid assemblies from a cohort of genetically diverse individuals. These assemblies cover more than 99% of the expected sequence in each genome and are more than 99% accurate at the structural and base pair levels. Based on alignments of the assemblies, we generate a draft pangenome that captures known variants and haplotypes and reveals new alleles at structurally complex loci. We also add 119 million base pairs of euchromatic polymorphic sequences and 1,115 gene duplications relative to the existing reference GRCh38. Roughly 90 million of the additional base pairs are derived from structural variation. Using our draft pangenome to analyse short-read data reduced small variant discovery errors by 34% and increased the number of structural variants detected per haplotype by 104% compared with GRCh38-based workflows, which enabled the typing of the vast majority of structural variant alleles per sample.
2023-05-12 | Bioorthogonal Chemical Imaging of Cell Metabolism Regulated by Aromatic Amino Acids | Bagheri P, Hoang K, Kuo CY, Trivedi H, Jang H, Shi L | TMC-WUSTL | Essential aromatic amino acids (AAAs) are building blocks for synthesizing new biomasses in cells and sustaining normal biological functions. For example, an abundant supply of AAAs is important for cancer cells to maintain their rapid growth and division. With this, there is a rising demand for a highly specific, noninvasive imaging approach with minimal sample preparation to directly visualize how cells harness AAAs for their metabolism in situ. Here, we develop an optical imaging platform that combines deuterium oxide (D2O) probing with stimulated Raman scattering (DO-SRS) and integrates DO-SRS with two-photon excitation fluorescence (2PEF) into a single microscope to directly visualize the metabolic activities of HeLa cells under AAA regulation. Collectively, the DO-SRS platform provides high spatial resolution and specificity of newly synthesized proteins and lipids in single HeLa cell units. In addition, the 2PEF modality can detect autofluorescence signals of nicotinamide adenine dinucleotide (NADH) and Flavin in a label-free manner. The imaging system described here is compatible with both in vitro and in vivo models, which is flexible for various experiments. The general workflow of this protocol includes cell culture, culture media preparation, cell synchronization, cell fixation, and sample imaging with DO-SRS and 2PEF modalities.
2023-05-15 | Evaluation of cell segmentation methods without reference segmentations | Chen H, Murphy RF | HIVE TC-CMU | Cell segmentation is a cornerstone of many bioimage informatics studies, and inaccurate segmentation introduces error in downstream analysis. Evaluating segmentation results is thus a necessary step for developing segmentation methods as well as for choosing the most appropriate method for a particular type of sample. The evaluation process has typically involved comparison of segmentations with those generated by humans, which can be expensive and subject to unknown bias. We present here an approach to evaluating cell segmentation methods without relying upon comparison to results from humans. For this, we defined a number of segmentation quality metrics that can be applied to multichannel fluorescence images. We calculated these metrics for 14 previously described segmentation methods applied to datasets from four multiplexed microscope modalities covering five tissues. Using principal component analysis to combine the metrics, we defined an overall cell segmentation quality score and ranked the segmentation methods. We found that two deep learning-based methods performed the best overall, but that results for all methods could be significantly improved by postprocessing to ensure proper matching of cell and nuclear masks. Our evaluation tool is available as open source and all code and data are available in a Reproducible Research Archive.
2023-05-15 | Pangenome graph construction from genome alignments with Minigraph-Cactus | Hickey G, Monlong J, Ebler J, Novak AM, Eizenga JM, Gao Y; Human Pangenome Reference Consortium; Marschall T, Li H, Paten B | HIVE TC-CMU | Pangenome references address biases of reference genomes by storing a representative set of diverse haplotypes and their alignment, usually as a graph. Alternate alleles determined by variant callers can be used to construct pangenome graphs, but advances in long-read sequencing are leading to widely available, high-quality phased assemblies. Constructing a pangenome graph directly from assemblies, as opposed to variant calls, leverages the graph's ability to represent variation at different scales. Here we present the Minigraph-Cactus pangenome pipeline, which creates pangenomes directly from whole-genome alignments, and demonstrate its ability to scale to 90 human haplotypes from the Human Pangenome Reference Consortium. The method builds graphs containing all forms of genetic variation while allowing use of current mapping and genotyping tools. We measure the effect of the quality and completeness of reference genomes used for analysis within the pangenomes and show that using the CHM13 reference from the Telomere-to-Telomere Consortium improves the accuracy of our methods. We also demonstrate construction of a Drosophila melanogaster pangenome.
2023-05-25 | Dictionary learning for integrative, multimodal and scalable single-cell analysis | Hao Y, Stuart T, Kowalski MH, Choudhary S, Hoffman P, Hartman A, Srivastava A, Molla G, Madad S, Fernandez-Granda C, Satija R | HIVE MC-NYGC | Mapping single-cell sequencing profiles to comprehensive reference datasets provides a powerful alternative to unsupervised analysis. However, most reference datasets are constructed from single-cell RNA-sequencing data and cannot be used to annotate datasets that do not measure gene expression. Here we introduce 'bridge integration', a method to integrate single-cell datasets across modalities using a multiomic dataset as a molecular bridge. Each cell in the multiomic dataset constitutes an element in a 'dictionary', which is used to reconstruct unimodal datasets and transform them into a shared space. Our procedure accurately integrates transcriptomic data with independent single-cell measurements of chromatin accessibility, histone modifications, DNA methylation and protein levels. Moreover, we demonstrate how dictionary learning can be combined with sketching techniques to improve computational scalability and harmonize 8.6 million human immune cell profiles from sequencing and mass cytometry experiments. Our approach, implemented in version 5 of our Seurat toolkit (http://www.satijalab.org/seurat), broadens the utility of single-cell reference datasets and facilitates comparisons across diverse molecular modalities.
2023-05-26 | Ex Vivo OCT-Based Multimodal Imaging of Human Donor Eyes for Research into Age-Related Macular Degeneration | Messinger JD, Brinkmann M, Kimble JA, Berlin A, Freund KB, Grossman GH, Ach T, Curcio CA | TMC-Vanderbilt (Eye/pancreas) | A progression sequence for age-related macular degeneration (AMD) learned from optical coherence tomography (OCT)-based multimodal (MMI) clinical imaging could add prognostic value to laboratory findings. In this work, ex vivo OCT and MMI were applied to human donor eyes prior to retinal tissue sectioning. The eyes were recovered from non-diabetic white donors aged ≥80 years old, with a death-to-preservation time (DtoP) of ≤6 h. The globes were recovered on-site, scored with an 18 mm trephine to facilitate cornea removal, and immersed in buffered 4% paraformaldehyde. Color fundus images were acquired after anterior segment removal with a dissecting scope and an SLR camera using trans-, epi-, and flash illumination at three magnifications. The globes were placed in a buffer within a custom-designed chamber with a 60 diopter lens. They were imaged with spectral domain OCT (30° macula cube, 30 µm spacing, averaging = 25), near-infrared reflectance, 488 nm autofluorescence, and 787 nm autofluorescence. The AMD eyes showed a change in the retinal pigment epithelium (RPE), with drusen or subretinal drusenoid deposits (SDDs), with or without neovascularization, and without evidence of other causes. Between June 2016 and September 2017, 94 right eyes and 90 left eyes were recovered (DtoP: 3.9 ± 1.0 h). Of the 184 eyes, 40.2% had AMD, including early intermediate (22.8%), atrophic (7.6%), and neovascular (9.8%) AMD, and 39.7% had unremarkable maculas. Drusen, SDDs, hyper-reflective foci, atrophy, and fibrovascular scars were identified using OCT. Artifacts included tissue opacification, detachments (bacillary, retinal, RPE, choroidal), foveal cystic change, an undulating RPE, and mechanical damage. To guide the cryo-sectioning, OCT volumes were used to find the fovea and optic nerve head landmarks and specific pathologies. The ex vivo volumes were registered with the in vivo volumes by selecting the reference function for eye tracking. The ex vivo visibility of the pathology seen in vivo depends on the preservation quality. Within 16 months, 75 rapid DtoP donor eyes at all stages of AMD were recovered and staged using clinical MMI methods.
2023-06-03 | PodoCount: A Robust, Fully Automated, Whole-Slide Podocyte Quantification Tool | Santo BA, Govind D, Daneshpajouhnejad P, Yang X, Wang XX, Myakala K, Jones BA, Levi M, Kopp JB, Yoshida T, Niedernhofer LJ, Manthey D, Moon KC, Han SS, Zee J, Rosenberg AZ, Sarder P | TMC-UCSD | Introduction: Podocyte depletion is a histomorphologic indicator of glomerular injury and predicts clinical outcomes. Podocyte estimation methods or podometrics are semiquantitative, technically involved, and laborious. Implementation of high-throughput podometrics in experimental and clinical workflows necessitates an automated podometrics pipeline. Recognizing that computational image analysis offers a robust approach to study cell and tissue structure, we developed and validated PodoCount (a computational tool for automated podocyte quantification in immunohistochemically labeled tissues) using a diverse data set. Methods: Whole-slide images (WSIs) of tissues immunostained with a podocyte nuclear marker and periodic acid-Schiff counterstain were acquired. The data set consisted of murine whole kidney sections (n = 135) from 6 disease models and human kidney biopsy specimens from patients with diabetic nephropathy (DN) (n = 45). Within segmented glomeruli, podocytes were extracted and image analysis was applied to compute measures of podocyte depletion and nuclear morphometry. Computational performance evaluation and statistical testing were performed to validate podometric and associated image features. PodoCount was distributed as an open-source, cloud-based computational tool. Results: PodoCount produced highly accurate podocyte quantification when benchmarked against existing methods. Podocyte nuclear profiles were identified with 0.98 accuracy and segmented with 0.85 sensitivity and 0.99 specificity. Errors in podocyte count were bounded by 1 podocyte per glomerulus. Podocyte-specific image features were found to be significant predictors of disease state, proteinuria, and clinical outcome. Conclusion: PodoCount offers high-performance podocyte quantitation in diverse murine disease models and in human kidney biopsy specimens. Resultant features offer significant correlation with associated metadata and outcome. Our cloud-based tool will provide end users with a standardized approach for automated podometrics from gigapixel-sized WSIs.
2023-06-20 | The expanding vistas of spatial transcriptomics | Tian L, Chen F, Macosko EZ | RTI-Broad | The formation and maintenance of tissue integrity requires complex, coordinated activities by thousands of genes and their encoded products. Until recently, transcript levels could only be quantified for a few genes in tissues, but advances in DNA sequencing, oligonucleotide synthesis and fluorescence microscopy have enabled the invention of a suite of spatial transcriptomics technologies capable of measuring the expression of many, or all, genes in situ. These technologies have evolved rapidly in sensitivity, multiplexing and throughput. As such, they have enabled the determination of the cell-type architecture of tissues, the querying of cell-cell interactions and the monitoring of molecular interactions between tissue components. The rapidly evolving spatial genomics landscape will enable generalized high-throughput genomic measurements and perturbations to be performed in the context of tissues. These advances will empower hypothesis generation and biological discovery and bridge the worlds of tissue biology and genomics.
2023-06-20 | Systems biology approaches to unravel lymphocyte subsets and function | Kim Y, Greenleaf WJ, Bendall SC | TMC-BIDMC | Single-cell technologies have revealed the extensive heterogeneity and complexity of the immune system. Systems biology approaches in immunology have taken advantage of the high-parameter, high-throughput data and analyzed immune cell types in a 'bottom-up' data-driven method. This approach has discovered previously unrecognized cell types and functions. Especially for human immunology, in which experimental manipulations are challenging, the systems approach has become a successful means to investigate physiologically relevant contexts. This review focuses on the recent findings in lymphocyte biology, from their development, differentiation into subsets, and heterogeneity in their functions, enabled by these systems approaches. Furthermore, we review examples of the application of findings from systems approach studies and discuss how not to leave the rich dataset in the curse of high dimensionality.
2023-06-20 | Prospective validation of diffusion-weighted MRI as a biomarker of tumor response and oncologic outcomes in head and neck cancer: Results from an observational biomarker pre-qualification study | Joint Head and Neck Radiotherapy-MRI Development Cooperative; Mohamed ASR, Abusaif A, He R, Wahid KA, Salama V, Youssef S, McDonald BA, Naser M, Ding Y, Salzillo TC, AboBakr MA, Wang J, Lai SY, Fuller CD | HIVE IEC-PSC | Purpose: To determine DWI parameters associated with tumor response and oncologic outcomes in head and neck cancer (HNC) patients treated with radiotherapy (RT). Methods: HNC patients in a prospective study were included. Patients had MRIs pre-, mid-, and post-RT completion. We used T2-weighted sequences for tumor segmentation which were co-registered to respective DWIs for extraction of apparent diffusion coefficient (ADC) measurements. Treatment response was assessed at mid- and post-RT and was defined as: complete response (CR) vs. non-complete response (non-CR). The Mann-Whitney U test was used to compare ADC between CR and non-CR. Recursive partitioning analysis (RPA) was performed to identify ADC threshold associated with relapse. Cox proportional hazards models were done for clinical vs. clinical and imaging parameters and internal validation was done using bootstrapping technique. Results: Eighty-one patients were included. Median follow-up was 31 months. For patients with post-RT CR, there was a significant increase in mean ADC at mid-RT compared to baseline ((1.8 ± 0.29) × 10⁻³ mm²/s vs. (1.37 ± 0.22) × 10⁻³ mm²/s, p < 0.0001), while patients with non-CR had no significant increase (p > 0.05). RPA identified GTV-P delta (Δ)ADCmean < 7% at mid-RT as the most significant parameter associated with worse local control (LC) and recurrence-free survival (RFS) (p = 0.01). Uni- and multi-variable analysis showed that GTV-P ΔADCmean at mid-RT ≥ 7% was significantly associated with better LC and RFS. The addition of ΔADCmean significantly improved the c-indices of LC and RFS models compared with standard clinical variables (0.85 vs. 0.77 and 0.74 vs. 0.68 for LC and RFS, respectively, p < 0.0001 for both). Conclusion: ΔADCmean at mid-RT is a strong predictor of oncologic outcomes in HNC. Patients with no significant increase of primary tumor ADC at mid-RT are at high risk of disease relapse.
2023-06-23 | Multi-omic longitudinal study reveals immune correlates of clinical course among hospitalized COVID-19 patients | Diray-Arce J, Fourati S, Doni Jayavelu N, Patel R, Maguire C, Chang AC, Dandekar R, Qi J, Lee BH, van Zalm P, Schroeder A, Chen E, Konstorum A, Brito A, Gygi JP, Kho A, Chen J, Pawar S, Gonzalez-Reiche AS, Hoch A, Milliren CE, Overton JA, Westendorf K, IMPACC Network; Cairns CB, Rouphael N, Bosinger SE, Kim-Schulze S, Krammer F, Rosen L, Grubaugh ND, van Bakel H, Wilson M, Rajan J, Steen H, Eckalbar W, Cotsapas C, Langelier CR, Levy O, Altman MC, Maecker H, Montgomery RR, Haddad EK, Sekaly RP, Esserman D, Ozonoff A, Becker PM, Augustine AD, Guan L, Peters B, Kleinstein SH | TMC-Florida | The IMPACC cohort, composed of >1,000 hospitalized COVID-19 participants, contains five illness trajectory groups (TGs) during acute infection (first 28 days), ranging from milder (TG1-3) to more severe disease course (TG4) and death (TG5). Here, we report deep immunophenotyping, profiling of >15,000 longitudinal blood and nasal samples from 540 participants of the IMPACC cohort, using 14 distinct assays. These unbiased analyses identify cellular and molecular signatures present within 72 h of hospital admission that distinguish moderate from severe and fatal COVID-19 disease. Importantly, cellular and molecular states also distinguish participants with more severe disease that recover or stabilize within 28 days from those that progress to fatal outcomes (TG4 vs. TG5). Furthermore, our longitudinal design reveals that these biologic states display distinct temporal patterns associated with clinical outcomes. Characterizing host immune responses in relation to heterogeneity in disease course may inform clinical prognosis and opportunities for intervention.
2023-06-27 | Microtechnologies for single-cell and spatial multi-omics | Deng Y, Bai Z, Fan R | TTD-Yale | Single-cell omics assays allow the identification of the type, subtype and functional state of a single cell. To put such single-cell data in the context of tissues, spatially resolved omics can be applied to quantify gene expression and regulation in intact tissues at the genome scale. However, to obtain a full picture of gene regulatory networks in a cell, multi-omic assays are required that can assess two or more modalities of omics information. In this Review, we discuss microfabricated systems that can be engineered to isolate, probe, manipulate and process single cells at the micrometre scale for single-cell and spatial multi-omics studies. We outline microchannel-, microarray- and droplet-based microfluidic platforms, examining their application in multimodal cellular profiling at the cellular and subcellular level. Finally, we discuss the key challenges that need to be addressed to advance the translation and commercialization of such microchip-based technologies for fundamental research and medical applications.
2023-06-28 | Unsupervised cell functional annotation for single-cell RNA-seq | Li D, Ding J, Bar-Joseph Z | HIVE TC-CMU | One of the first steps in the analysis of single-cell RNA sequencing (scRNA-seq) data is the assignment of cell types. Although a number of supervised methods have been developed for this, in most cases such assignment is performed by first clustering cells in low-dimensional space and then assigning cell types to different clusters. To overcome noise and to improve cell type assignments, we developed UNIFAN, a neural network method that simultaneously clusters and annotates cells using known gene sets. UNIFAN combines both low-dimensional representation for all genes and cell-specific gene set activity scores to determine the clustering. We applied UNIFAN to human and mouse scRNA-seq data sets from several different organs. We show, by using knowledge about gene sets, that UNIFAN greatly outperforms prior methods developed for clustering scRNA-seq data. The gene sets assigned by UNIFAN to different clusters provide strong evidence for the cell type that is represented by this cluster, making annotations easier.
2023-06-29 | Quantifying radiation in the axillary bed at the site of lymphedema surgical prevention | Friedman R, Spiegel DY, Kinney J, Willcox J, Recht A, Singhal D | TMC-BIDMC | Purpose: Immediate lymphatic reconstruction (ILR) is a procedure known to reduce the risk of lymphedema in patients undergoing axillary lymph node dissection (ALND). However, patients who receive adjuvant radiotherapy are at increased risk of lymphedema. The aim of this study was to quantify the extent of radiation at the site of surgical prevention. Methods: We recently began deploying clips at the site of ILR to identify the site during radiation planning. A retrospective review was performed to identify breast cancer patients who underwent ILR with clip deployment and adjuvant radiation therapy from October 2020 to April 2022. Patients were excluded if they had not completed radiotherapy. The exposure and dose of radiation received by the site was determined and recorded. Results: In a cohort of 11 patients, the site fell within the radiation field in 7 patients (64%) and received a median dose of 4280 cGy. Among these 7 patients, 3 had sites located within tissue considered at risk of oncologic recurrence and the remaining 4 sites received radiation from a tangential field treating the breast or chest wall. The median dose to the ILR site for the 4 patients whose sites were outside the radiation fields was 233 cGy. Conclusion: Our findings suggest that even when the site of surgical prevention was not within the targeted radiation field during treatment planning, it remains susceptible to radiation. Strategies for limiting radiation at this site are needed.
2023-06-29 | Polygenic prediction of preeclampsia and gestational hypertension | Honigberg MC, Truong B, Khan RR, Xiao B, Bhatta L, Vy HMT, Guerrero RF, Schuermans A, Selvaraj MS, Patel AP, Koyama S, Cho SMJ, Vellarikkal SK, Trinder M, Urbut SM, Gray KJ, Brumpton BM, Patil S, Zöllner S, Antopia MC, Saxena R, Nadkarni GN, Do R, Yan Q, Pe'er I, Verma SS, Gupta RM, Haas DM, Martin HC, van Heel DA, Laisk T, Natarajan P | DP-Harvard | Preeclampsia and gestational hypertension are common pregnancy complications associated with adverse maternal and child outcomes. Current tools for prediction, prevention and treatment are limited. Here we tested the association of maternal DNA sequence variants with preeclampsia in 20,064 cases and 703,117 control individuals and with gestational hypertension in 11,027 cases and 412,788 control individuals across discovery and follow-up cohorts using multi-ancestry meta-analysis. Altogether, we identified 18 independent loci associated with preeclampsia/eclampsia and/or gestational hypertension, 12 of which are new (for example, MTHFR-CLCN6, WNT3A, NPR3, PGR and RGL3), including two loci (PLCE1 and FURIN) identified in the multitrait analysis. Identified loci highlight the role of natriuretic peptide signaling, angiogenesis, renal glomerular function, trophoblast development and immune dysregulation. We derived genome-wide polygenic risk scores that predicted preeclampsia/eclampsia and gestational hypertension in external cohorts, independent of clinical risk factors, and reclassified eligibility for low-dose aspirin to prevent preeclampsia. Collectively, these findings provide mechanistic insights into the hypertensive disorders of pregnancy and have the potential to advance pregnancy risk stratification.
2023-06-30 | Deriving spatial features from in situ proteomics imaging to enhance cancer survival analysis | Dayao MT, Trevino A, Kim H, Ruffalo M, D'Angio HB, Preska R, Duvvuri U, Mayer AT, Bar-Joseph Z | HIVE TC-CMU | Motivation: Spatial proteomics data have been used to map cell states and improve our understanding of tissue organization. More recently, these methods have been extended to study the impact of such organization on disease progression and patient survival. However, to date, the majority of supervised learning methods utilizing these data types did not take full advantage of the spatial information, impacting their performance and utilization. Results: Taking inspiration from ecology and epidemiology, we developed novel spatial feature extraction methods for use with spatial proteomics data. We used these features to learn prediction models for cancer patient survival. As we show, using the spatial features led to consistent improvement over prior methods that used the spatial proteomics data for the same task. In addition, feature importance analysis revealed new insights about the cell interactions that contribute to patient survival. Availability and implementation: The code for this work can be found at gitlab.com/enable-medicine-public/spatsurv.
2023-07-05 | Tissue Mass Spectrometry: How Solid Is Our Future? | Unsihuay D, Phipps WS, Paulovich AG, Chapman JR, Ducret A, Eberlin LS, Spraggins JM, Goodwin RJA | TMC-Vanderbilt (Kidney) | NA
2023-07-10 | SCS: cell segmentation for high-resolution spatial transcriptomics | Chen H, Li D, Bar-Joseph Z | HIVE TC-CMU | Spatial transcriptomics promises to greatly improve our understanding of tissue organization and cell-cell interactions. While most current platforms for spatial transcriptomics only offer multi-cellular resolution, with 10-15 cells per spot, recent technologies provide a much denser spot placement leading to subcellular resolution. A key challenge for these newer methods is cell segmentation and the assignment of spots to cells. Traditional image-based segmentation methods are limited and do not make full use of the information profiled by spatial transcriptomics. Here we present subcellular spatial transcriptomics cell segmentation (SCS), which combines imaging data with sequencing data to improve cell segmentation accuracy. SCS assigns spots to cells by adaptively learning the position of each spot relative to the center of its cell using a transformer neural network. SCS was tested on two new subcellular spatial transcriptomics technologies and outperformed traditional image-based segmentation methods. SCS achieved better accuracy, identified more cells and provided more realistic cell size estimation. Subcellular analysis of RNAs using SCS spot assignments provides information on RNA localization and further supports the segmentation results.
2023-07-19 | Organization of the Human Intestine at Single Cell Resolution | Hickey JW, Becker WR, Nevins SA, Horning A, Perez AE, Zhu C, Zhu B, Wei B, Chiu R, Chen DC, Cotter DL, Esplin ED, Weimer AK, Caraccio C, Venkataraaman V, Schürch CM, Black S, Brbić M, Cao K, Chen S, Zhang W, Monte E, Zhang NR, Ma Z, Leskovec J, Zhang Z, Lin S, Longacre T, Plevritis SK, Lin Y, Nolan GP, Greenleaf WJ, Snyder M | TMC-Stanford | The intestine is a complex organ that promotes digestion, extracts nutrients, participates in immune surveillance, maintains critical symbiotic relationships with microbiota and affects overall health. The intestine has a length of over nine metres, along which there are differences in structure and function. The localization of individual cell types, cell type development trajectories and detailed cell transcriptional programs probably drive these differences in function. Here, to better understand these differences, we evaluated the organization of single cells using multiplexed imaging and single-nucleus RNA and open chromatin assays across eight different intestinal sites from nine donors. Through systematic analyses, we find cell compositions that differ substantially across regions of the intestine and demonstrate the complexity of epithelial subtypes, and find that the same cell types are organized into distinct neighbourhoods and communities, highlighting distinct immunological niches that are present in the intestine. We also map gene regulatory differences in these cells that are suggestive of a regulatory differentiation cascade, and associate intestinal disease heritability with specific cell types. These results describe the complexity of the cell composition, regulation and organization for this organ, and serve as an important reference map for understanding human biology and disease.
2023-07-19 | Segmentation of human functional tissue units in support of a Human Reference Atlas | Jain Y, Godwin LL, Ju Y, Sood N, Quardokus EM, Bueckle A, Longacre T, Horning A, Lin Y, Esplin ED, Hickey JW, Snyder MP, Patterson NH, Spraggins JM, Börner K | HIVE MC-IU | The Human BioMolecular Atlas Program (HuBMAP) aims to compile a Human Reference Atlas (HRA) for the healthy adult body at the cellular level. Functional tissue units (FTUs), relevant for HRA construction, are of pathobiological significance. Manual segmentation of FTUs does not scale; highly accurate and performant, open-source machine-learning algorithms are needed. We designed and hosted a Kaggle competition that focused on development of such algorithms, and 1200 teams from 60 countries participated. We present the competition outcomes and an expanded analysis of the winning algorithms on additional kidney and colon tissue data, and conduct a pilot study to understand spatial location and density of FTUs across the kidney. The top algorithm from the competition, Tom, outperforms other algorithms in the expanded study, while using fewer computational resources. Tom was added to the HuBMAP infrastructure to run kidney FTU segmentation at scale, showcasing the value of Kaggle competitions for advancing research.
2023-07-193D reconstruction of skin and spatial mapping of immune cell density, vascular distance and effects of sun exposure and agingGhose S, Ju Y, McDonough E, Ho J, Karunamurthy A, Chadwick C, Cho S, Rose R, Corwin A, Surrette C, Martinez J, Williams E, Sood A, Al-Kofahi Y, Falo LD Jr, Börner K, Ginty FTMC-GE GlobalMapping the human body at single cell resolution in three dimensions (3D) is important for understanding cellular interactions in context of tissue and organ organization. 2D spatial cell analysis in a single tissue section may be limited by cell numbers and histology. Here we show a workflow for 3D reconstruction of multiplexed sequential tissue sections: MATRICS-A (Multiplexed Image Three-D Reconstruction and Integrated Cell Spatial - Analysis). We demonstrate MATRICS-A in 26 serial sections of fixed skin (stained with 18 biomarkers) from 12 donors aged between 32-72 years. Comparing the 3D reconstructed cellular data with the 2D data, we show significantly shorter distances between immune cells and vascular endothelial cells (56 µm in 3D vs 108 µm in 2D). We also show 10-70% more T cells (total) within 30 µm of a neighboring T helper cell in 3D vs 2D. Distances of p53, DDB2 and Ki67 positive cells to the skin surface were consistent across all ages/sun exposure and largely localized to the lower stratum basale layer of the epidermis. MATRICS-A provides a framework for analysis of 3D spatial cell relationships in healthy and aging organs and could be further extended to diseased organs.
2023-07-19Segmenting functional tissue units across human organs using community-driven development of generalizable machine learning algorithmsBörner KHIVE MC-IUNA  
2023-07-19Anatomical structures, cell types, and biomarkers of the healthy human blood vasculatureBoppana A, Lee S, Malhotra R, Halushka M, Gustilo KS, Quardokus EM, Herr BW 2nd, Börner K, Weber GMHIVE MC-IUMore than 150 scientists from 17 consortia are collaborating on an international project to build a Human Reference Atlas, which maps all 37 trillion cells in the healthy adult human body. The initial release of this atlas provided hierarchical lists of the anatomical structures, cell types, and biomarkers in 11 organs. Here, we describe the methods we used as part of this initiative to build the first open, computer-readable, and comprehensive database of the adult human blood vasculature, called the Human Reference Atlas-Vasculature Common Coordinate Framework (HRA-VCCF). It includes 993 vessels and their branching connections, 10 cell types, and 10 biomarkers. With this paper we are releasing additional details on vessel types and subtypes, branching sequence, anastomoses, portal systems, microvasculature, functional tissue units, mappings to regions vessels supply or drain, geometric properties of vessels, and links to 3D reference objects. Future versions will add variants and connections to the lymph vasculature; and, it will iteratively expand and improve the database as additional experimental data become available through the participating consortia.
2023-07-19A spatially resolved timeline of the human maternal-fetal interfaceGreenbaum S, Averbukh I, Soon E, Rizzuto G, Baranski A, Greenwald NF, Kagel A, Bosse M, Jaswa EG, Khair Z, Kwok S, Warshawsky S, Piyadasa H, Goldston M, Spence A, Miller G, Schwartz M, Graf W, Van Valen D, Winn VD, Hollmann T, Keren L, van de Rijn M, Angelo MTMC-Stanford (Bone marrow)Beginning in the first trimester, fetally derived extravillous trophoblasts (EVTs) invade the uterus and remodel its spiral arteries, transforming them into large, dilated blood vessels. Several mechanisms have been proposed to explain how EVTs coordinate with the maternal decidua to promote a tissue microenvironment conducive to spiral artery remodelling (SAR). However, it remains a matter of debate regarding which immune and stromal cells participate in these interactions and how this evolves with respect to gestational age. Here we used a multiomics approach, combining the strengths of spatial proteomics and transcriptomics, to construct a spatiotemporal atlas of the human maternal-fetal interface in the first half of pregnancy. We used multiplexed ion beam imaging by time-of-flight and a 37-plex antibody panel to analyse around 500,000 cells and 588 arteries within intact decidua from 66 individuals between 6 and 20 weeks of gestation, integrating this dataset with co-registered transcriptomics profiles. Gestational age substantially influenced the frequency of maternal immune and stromal cells, with tolerogenic subsets expressing CD206, CD163, TIM-3, galectin-9 and IDO-1 becoming increasingly enriched and colocalized at later time points. By contrast, SAR progression preferentially correlated with EVT invasion and was transcriptionally defined by 78 gene ontology pathways exhibiting distinct monotonic and biphasic trends.
Last, we developed an integrated model of SAR whereby invasion is accompanied by the upregulation of pro-angiogenic, immunoregulatory EVT programmes that promote interactions with the vascular endothelium while avoiding the activation of maternal immune cells.
2023-07-19Organ Mapping Antibody Panels (OMAPs): A community resource for standardized multiplexed tissue imagingQuardokus EM, Saunders DC, McDonough E, Hickey JW, Werlein C, Surrette C, Rajbhandari P, Casals AM, Tian H, Lowery L, Neumann EK, Björklund F, Neelakantan TV, Croteau J, Wiblin AE, Fisher J, Livengood AJ, Dowell KG, Silverstein JC, Spraggins JM, Pryhuber GS, Deutsch G, Ginty F, Nolan GP, Melov S, Jonigk D, Caldwell MA, Vlachos IS, Muller W, Gehlenborg N, Stockwell BR, Lundberg E, Snyder MP, Germain RN, Camarillo JM, Kelleher NL, Börner K, Radtke AJConsortiumMultiplexed antibody-based imaging enables the detailed characterization of molecular and cellular organization in tissues. Advances in the field now allow high-parameter data collection (>60 targets); however, considerable expertise and capital are needed to construct the antibody panels employed by these methods. Organ mapping antibody panels are community-validated resources that save time and money, increase reproducibility, accelerate discovery and support the construction of a Human Reference Atlas.
2023-07-19An atlas of healthy and injured cell states and niches in the human kidneyLake BB, Menon R, Winfree S, Hu Q, Ferreira RM, Kalhor K, Barwinska D, Otto EA, Ferkowicz M, Diep D, Plongthongkum N, Knoten A, Urata S, Mariani LH, Naik AS, Eddy S, Zhang B, Wu Y, Salamon D, Williams JC, Wang X, Balderrama KS, Hoover PJ, Murray E, Marshall JL, Noel T, Vijayan A, Hartman A, Chen F, Waikar SS, Rosas SE, Wilson FP, Palevsky PM, Kiryluk K, Sedor JR, Toto RD, Parikh CR, Kim EH, Satija R, Greka A, Macosko EZ, Kharchenko PV, Gaut JP, Hodgin JB; KPMP Consortium; Eadon MT, Dagher PC, El-Achkar TM, Zhang K, Kretzler M, Jain SConsortiumUnderstanding kidney disease relies on defining the complexity of cell types and states, their associated molecular profiles and interactions within tissue neighbourhoods. Here we applied multiple single-cell and single-nucleus assays (>400,000 nuclei or cells) and spatial imaging technologies to a broad spectrum of healthy reference kidneys (45 donors) and diseased kidneys (48 patients). This has provided a high-resolution cellular atlas of 51 main cell types, which include rare and previously undescribed cell populations. The multi-omic approach provides detailed transcriptomic profiles, regulatory factors and spatial localizations spanning the entire kidney. We also define 28 cellular states across nephron segments and interstitium that were altered in kidney injury, encompassing cycling, adaptive (successful or maladaptive repair), transitioning and degenerative states. Molecular signatures permitted the localization of these states within injury neighbourhoods using spatial transcriptomics, while large-scale 3D imaging analysis (around 1.2 million neighbourhoods) provided corresponding linkages to active immune responses.
These analyses defined biological pathways that are relevant to injury time-course and niches, including signatures underlying epithelial repair that predicted maladaptive states associated with a decline in kidney function. This integrated multimodal spatial cell atlas of healthy and diseased human kidneys represents a comprehensive benchmark of cellular states, neighbourhoods, outcome-associated signatures and publicly available interactive visualizations.
2023-07-19A spatially anchored transcriptomic atlas of the human kidney papilla identifies significant immune injury in patients with stone diseaseCanela VH, Bowen WS, Ferreira RM, Syed F, Lingeman JE, Sabo AR, Barwinska D, Winfree S, Lake BB, Cheng YH, Gaut JP, Ferkowicz M, LaFavers KA, Zhang K, Coe FL, Worcester E; Kidney Precision Medicine Project; Jain S, Eadon MT, Williams JC Jr, El-Achkar TMConsortiumKidney stone disease causes significant morbidity and increases health care utilization. In this work, we decipher the cellular and molecular niche of the human renal papilla in patients with calcium oxalate (CaOx) stone disease and healthy subjects. In addition to identifying cell types important in papillary physiology, we characterize collecting duct cell subtypes and an undifferentiated epithelial cell type that was more prevalent in stone patients. Despite the focal nature of mineral deposition in nephrolithiasis, we uncover a global injury signature characterized by immune activation, oxidative stress and extracellular matrix remodeling. We also identify the association of MMP7 and MMP9 expression with stone disease and mineral deposition, respectively. MMP7 and MMP9 are significantly increased in the urine of patients with CaOx stone disease, and their levels correlate with disease activity. Our results define the spatial molecular landscape and specific pathways contributing to stone-mediated injury in the human papilla and identify associated urinary biomarkers.
2023-07-19Advances and Prospects for the Human BioMolecular Atlas Program (HuBMAP)Jain S, Pei L, Spraggins JM, Angelo M, Carson JP, Gehlenborg N, Ginty F, Gonçalves JP, Hagood JS, Hickey JW, Kelleher NL, Laurent LC, Lin S, Lin Y, Liu H, Naba A, Nakayasu ES, Qian WJ, Radtke A, Robson P, Stockwell BR, Van de Plas R, Vlachos IS, Zhou M; HuBMAP Consortium; Börner K, Snyder MPConsortiumThe Human BioMolecular Atlas Program (HuBMAP) aims to create a multi-scale spatial atlas of the healthy human body at single-cell resolution by applying advanced technologies and disseminating resources to the community. As the HuBMAP moves past its first phase, creating ontologies, protocols and pipelines, this Perspective introduces the production phase: the generation of reference spatial maps of functional tissue units across many organs from diverse populations and the creation of mapping tools and infrastructure to advance biomedical research.
2023-07-19Multimodal mass spectrometry imaging reveals molecular, cellular and structural organization of mammalian liver at single-cell resolutionPREPRINTTTD-Columbia/Penn StateNA
2023-07-22Non-Linear Lymphatic Anatomy in Breast Cancer Patients Prior to Axillary Lymph Node Dissection: A Risk Factor For Lymphedema DevelopmentKinney JR, Friedman R, Kim E, Tillotson E, Shillue K, Lee BT, Singhal DTMC-BIDMCImmediate lymphatic reconstruction (ILR) at the time of axillary lymph node dissection (ALND) has become increasingly utilized for the prevention of breast cancer related lymphedema. Preoperative indocyanine green (ICG) lymphography is routinely performed prior to an ILR procedure to characterize baseline lymphatic anatomy of the upper extremity. While most patients have linear lymphatic channels visualized on ICG, representing a non-diseased state, some patients demonstrate non-linear patterns. This study aims to determine potential inciting factors that help explain why some patients have non-linear patterns, and what these patterns represent regarding the relative risk of developing postoperative breast cancer related lymphedema in this population. A retrospective review was conducted to identify breast cancer patients who underwent successful ILR with preoperative ICG at our institution from November 2017-June 2022. Among the 248 patients who were identified, 13 (5%) had preoperative non-linear lymphatic anatomy. A history of trauma or surgery of the affected limb and an increasing number of sentinel lymph nodes removed prior to ALND appeared to be risk factors for non-linear lymphatic anatomy. Furthermore, non-linear anatomy in the limb of interest was associated with an increased risk of postoperative lymphedema development. Overall, non-linear lymphatic anatomy on pre-operative ICG lymphography appears to be a risk factor for developing ipsilateral breast cancer-related lymphedema. Guided by the study's findings, when breast cancer patients present with baseline non-linear lymphatic anatomy, our institution has implemented a protocol of prophylactically prescribing compression sleeves immediately following ALND.
2023-07-26Edematous Dermal Thickening on Magnetic Resonance Imaging as a Biomarker for Lymphatic Surgical OutcomesKinney JR, Babapour S, Kim E, Friedman R, Singhal D, Lee BT, Tsai LLTMC-BIDMCBackground and Objectives: One of the surgical treatments for breast cancer-related lymphedema (BCRL) is debulking lipectomy. The aim of this study is to investigate whether dermal thickness could be utilized as an objective indicator of post-operative changes following debulking. Materials and Methods: A retrospective review of BCRL patients who underwent debulking lipectomy was conducted. MRI-based dermal thickness was measured by two separate trained readers at 16 regions of the upper extremity. Pre- and post-operative reduction in dermal thickness was compared across the affected and unaffected (control) arms for each patient. The Wilcoxon rank sum test was used to assess for significant change. Univariate linear regression was used to assess the relationship between dermal thickness reduction and changes to LYMPH-Q scores, L-Dex scores, and relative volume change. Results: Seventeen patients were included in our analysis. There was significant reduction in dermal thickness at 5/16 regions in the affected arm. Dermal thickness change was significantly correlated with LYMPH-Q scores, L-Dex scores, and relative volume change in 2/16 limb compartments. There was predominant dermal thickening in the dorsal compartment of the upper arm and in the ventral and ulnar compartments of the forearm. Conclusions: Dermal thickness shows promising utility in tracking post-operative debulking procedures for breast cancer-related lymphedema. Further studies with larger patient populations and a variety of imaging modalities are required to continue to develop a clinically objective and reproducible method of post-surgical lymphedema staging and monitoring.
2023-07-31KRAS(G12D) drives lepidic adenocarcinoma through stem-cell reprogrammingJuul NH, Yoon JK, Martinez MC, Rishi N, Kazadaeva YI, Morri M, Neff NF, Trope WL, Shrager JB, Sinha R, Desai TJTTD-StanfordMany cancers originate from stem or progenitor cells hijacked by somatic mutations that drive replication, exemplified by adenomatous transformation of pulmonary alveolar epithelial type II (AT2) cells. Here we demonstrate a different scenario: expression of KRAS(G12D) in differentiated AT1 cells reprograms them slowly and asynchronously back into AT2 stem cells that go on to generate indolent tumours. Like human lepidic adenocarcinoma, the tumour cells slowly spread along alveolar walls in a non-destructive manner and have low ERK activity. We find that AT1 and AT2 cells act as distinct cells of origin and manifest divergent responses to concomitant WNT activation and KRAS(G12D) induction, which accelerates AT2-derived but inhibits AT1-derived adenoma proliferation. Augmentation of ERK activity in KRAS(G12D)-induced AT1 cells increases transformation efficiency, proliferation and progression from lepidic to mixed tumour histology. Overall, we have identified a new cell of origin for lung adenocarcinoma, the AT1 cell, which recapitulates features of human lepidic cancer. In so doing, we also uncover a capacity for oncogenic KRAS to reprogram a differentiated and quiescent cell back into its parent stem cell en route to adenomatous transformation. Our work further reveals that irrespective of a given cancer's current molecular profile and driver oncogene, the cell of origin exerts a pervasive and perduring influence on its subsequent behaviour.
2023-08-02Nanospray Desorption Electrospray Ionization (Nano-DESI) Mass Spectrometry Imaging with High Ion Mobility ResolutionJiang LX, Hernly E, Hu H, Hilger RT, Neuweger H, Yang M, Laskin JTTD-PurdueUntargeted separation of isomeric and isobaric species in mass spectrometry imaging (MSI) is challenging. The combination of ion mobility spectrometry (IMS) with MSI has emerged as an effective strategy for differentiating isomeric and isobaric species, which substantially enhances the molecular coverage and specificity of MSI experiments. In this study, we have implemented nanospray desorption electrospray ionization (nano-DESI) MSI on a trapped ion mobility spectrometry (TIMS) mass spectrometer. A new nano-DESI source was constructed, and a specially designed inlet extension was fabricated to accommodate the new source. The nano-DESI-TIMS-MSI platform was evaluated by imaging mouse brain tissue sections. We achieved high ion mobility resolution by utilizing three narrow mobility scan windows that covered the majority of the lipid molecules. Notably, the mobility resolution reaching up to 300 in this study is much higher than the resolution obtained in our previous study using drift tube IMS. High-resolution TIMS successfully separated lipid isomers and isobars, revealing their distinct localizations in tissue samples. Our results further demonstrate the power of high-mobility-resolution IMS for unraveling the complexity of biomolecular mixtures analyzed in MSI experiments.
2023-08-15Integrated single-cell chromatin and transcriptomic analyses of human scalp identify gene-regulatory programs and critical cell types for hair and skin diseasesOber-Reynolds B, Wang C, Ko JM, Rios EJ, Aasi SZ, Davis MM, Oro AE, Greenleaf WJTMC-StanfordGenome-wide association studies have identified many loci associated with hair and skin disease, but identification of causal variants requires deciphering of gene-regulatory networks in relevant cell types. We generated matched single-cell chromatin profiles and transcriptomes from scalp tissue from healthy controls and patients with alopecia areata, identifying diverse cell types of the hair follicle niche. By interrogating these datasets at multiple levels of cellular resolution, we infer 50-100% more enhancer-gene links than previous approaches and show that aggregate enhancer accessibility for highly regulated genes predicts expression. We use these gene-regulatory maps to prioritize cell types, genes and causal variants implicated in the pathobiology of androgenetic alopecia (AGA), eczema and other complex traits. AGA genome-wide association studies signals are enriched in dermal papilla regulatory regions, supporting the role of these cells as drivers of AGA pathogenesis. Finally, we train machine learning models to nominate single-nucleotide polymorphisms that affect gene expression through disruption of transcription factor binding, predicting candidate functional single-nucleotide polymorphism for AGA and eczema.
2023-08-15Proteome Mapping of the Human Pancreatic Islet Microenvironment Reveals Endocrine-Exocrine Signaling Sphere of InfluenceGosline SJC, Veličković M, Pino JC, Day LZ, Attah IK, Swensen AC, Danna V, Posso C, Rodland KD, Chen J, Matthews CE, Campbell-Thompson M, Laskin J, Burnum-Johnson K, Zhu Y, Piehowski PDTMC-PNNLThe need for a clinically accessible method with the ability to match protein activity within heterogeneous tissues is currently unmet by existing technologies. Our proteomics sample preparation platform, named microPOTS (Microdroplet Processing in One pot for Trace Samples), can be used to measure relative protein abundance in micron-scale samples alongside the spatial location of each measurement, thereby tying biologically interesting proteins and pathways to distinct regions. However, given the smaller pixel/voxel number and amount of tissue measured, standard mass spectrometric analysis pipelines have proven inadequate. Here we describe how existing computational approaches can be adapted to focus on the specific biological questions asked in spatial proteomics experiments. We apply this approach to present an unbiased characterization of the human islet microenvironment comprising the entire complex array of cell types involved while maintaining spatial information and the degree of the islet's sphere of influence. We identify specific functional activity unique to the pancreatic islet cells and demonstrate how far their signature can be detected in the adjacent tissue. Our results show that we can distinguish pancreatic islet cells from the neighboring exocrine tissue environment, recapitulate known biological functions of islet cells, and identify a spatial gradient in the expression of RNA processing proteins within the islet microenvironment.
2023-08-17Predicting transcriptional outcomes of novel multigene perturbations with GEARSRoohani Y, Huang K, Leskovec JTMC-StanfordUnderstanding cellular responses to genetic perturbation is central to numerous biomedical applications, from identifying genetic interactions involved in cancer to developing methods for regenerative medicine. However, the combinatorial explosion in the number of possible multigene perturbations severely limits experimental interrogation. Here, we present graph-enhanced gene activation and repression simulator (GEARS), a method that integrates deep learning with a knowledge graph of gene-gene relationships to predict transcriptional responses to both single and multigene perturbations using single-cell RNA-sequencing data from perturbational screens. GEARS is able to predict outcomes of perturbing combinations consisting of genes that were never experimentally perturbed. GEARS exhibited 40% higher precision than existing approaches in predicting four distinct genetic interaction subtypes in a combinatorial perturbation screen and identified the strongest interactions twice as well as prior approaches. Overall, GEARS can predict phenotypically distinct effects of multigene perturbations and thus guide the design of perturbational experiments.
2023-08-18Systematic Sampling of the Female Reproductive System for Molecular CharacterizationFisher SA, Grijalva M, Guo R, Johnston SA, Laurent LC, Nguyen H, Renz J, Rosario JG, Rudich S, Gregory BD, Kim J, O'Neill KTMC-UPennAs part of the National Institutes of Health Human BioMolecular Atlas Program to develop a global platform to map the 37 trillion cells in the adult human body, we are generating a comprehensive molecular characterization of the female reproductive system. Data gathered from multiple single-cell/single-nucleus and spatial molecular assays will be used to build a 3D molecular atlas. Herein, we describe our multistep protocol, beginning with an optimized organ procurement workflow that maintains functional characteristics of the uterus, ovaries, and fallopian tubes by perfusing these organs with preservation solution. We have also developed a structured tissue sampling procedure that retains information on individual-level anatomic, physiologic, and individual diversity of the female reproductive system, toward full exploration of the function and structure of female reproductive cells. © 2023 Wiley Periodicals LLC. 
Basic Protocol 1: Preparation and preservation of the female reproductive system (ovaries, fallopian tubes, and uterus) prior to procurement Basic Protocol 2: Removal of the female reproductive system en bloc Basic Protocol 3: Postsurgical dissection of ovaries Basic Protocol 4: Postsurgical dissection of fallopian tubes Basic Protocol 5: Postsurgical dissection of cervix Basic Protocol 6: Postsurgical dissection of uterine body Support Protocol 1: OCT-embedded tissue protocol Support Protocol 2: Tissue fixation protocol Support Protocol 3: Snap-frozen tissue protocol Basic Protocol 7: Tissue slice preparation for Visium analysis Support Protocol 4: Hematoxylin and eosin staining for 10X Visium imaging Basic Protocol 8: Manual tissue dissociation for Multiome analysis Basic Protocol 9: Tissue dissociation for Multiome analysis using S2 Singulator.
2023-08-29Proteome Landscapes of Human Hepatocellular Carcinoma and Intrahepatic CholangiocarcinomaYi X, Zhu J, Liu W, Peng L, Lu C, Sun P, Huang L, Nie X, Huang S, Guo T, Zhu YTTD-PurdueLiver cancer is among the top leading causes of cancer mortality worldwide. Particularly, hepatocellular carcinoma (HCC) and intrahepatic cholangiocarcinoma (CCA) have been extensively investigated from the aspect of tumor biology. However, a comprehensive and systematic understanding of the molecular characteristics of HCC and CCA remains absent. Here, we characterized the proteome landscapes of HCC and CCA using the data-independent acquisition (DIA) mass spectrometry (MS) method. By comparing the quantitative proteomes of HCC and CCA, we found several differences between the two cancer types. In particular, we found an abnormal lipid metabolism in HCC and activated extracellular matrix-related pathways in CCA. We next developed a three-protein classifier to distinguish CCA from HCC, achieving an area under the curve (AUC) of 0.92, and an accuracy of 90% in an independent validation cohort of 51 patients. The distinct molecular characteristics of HCC and CCA presented in this study provide new insights into the tumor biology of these two major important primary liver cancers. Our findings may help develop more efficient diagnostic approaches and new targeted drug treatments.
2023-09-01Matrisome AnalyzeR - a suite of tools to annotate and quantify ECM molecules in big datasets across organismsPetrov PB, Considine JM, Izzi V, Naba ADP-IllinoisThe extracellular matrix (ECM) is a complex meshwork of proteins that forms the scaffold of all tissues in multicellular organisms. It plays crucial roles in all aspects of life - from orchestrating cell migration during development, to supporting tissue repair. It also plays critical roles in the etiology or progression of diseases. To study this compartment, we have previously defined the compendium of all genes encoding ECM and ECM-associated proteins for multiple organisms. We termed this compendium the 'matrisome' and further classified matrisome components into different structural or functional categories. This nomenclature is now largely adopted by the research community to annotate '-omics' datasets and has contributed to advance both fundamental and translational ECM research. Here, we report the development of Matrisome AnalyzeR, a suite of tools including a web-based application and an R package. The web application can be used by anyone interested in annotating, classifying and tabulating matrisome molecules in large datasets without requiring programming knowledge. The companion R package is available to more experienced users, interested in processing larger datasets or in additional data visualization options.
2023-09-06Dynamic Glycoprotein Hyposialylation Promotes Chemotherapy Evasion and Metastatic Seeding of Quiescent Circulating Tumor Cell Clusters in Breast CancerDashzeveg NK, Jia Y, Zhang Y, Gerratana L, Patel P, Shajahan A, Dandar T, Ramos EK, Almubarak HF, Adorno-Cruz V, Taftaf R, Schuster EJ, Scholten D, Sokolowski MT, Reduzzi C, El-Shennawy L, Hoffmann AD, Manai M, Zhang Q, D'Amico P, Azadi P, Colley KJ, Platanias LC, Shah AN, Gradishar WJ, Cristofanilli M, Muller WA, Cobb BA, Liu H.TTD-PNNL/NorthwesternMost circulating tumor cells (CTC) are detected as single cells, whereas a small proportion of CTCs in multicellular clusters with stemness properties possess 20- to 100-times higher metastatic propensity than the single cells. Here we report that CTC dynamics in both singles and clusters in response to therapies predict overall survival for breast cancer. Chemotherapy-evasive CTC clusters are relatively quiescent with a specific loss of ST6GAL1-catalyzed α2,6-sialylation in glycoproteins. Dynamic hyposialylation in CTCs or deficiency of ST6GAL1 promotes cluster formation for metastatic seeding and enables cellular quiescence to evade paclitaxel treatment in breast cancer. Glycoproteomic analysis reveals newly identified protein substrates of ST6GAL1, such as adhesion or stemness markers PODXL, ICAM1, ECE1, ALCAM1, CD97, and CD44, contributing to CTC clustering (aggregation) and metastatic seeding. As a proof of concept, neutralizing antibodies against one newly identified contributor, PODXL, inhibit CTC cluster formation and lung metastasis associated with paclitaxel treatment for triple-negative breast cancer. Significance: This study discovers that dynamic loss of terminal sialylation in glycoproteins of CTC clusters contributes to the fate of cellular dormancy, advantageous evasion to chemotherapy, and enhanced metastatic seeding. 
It identifies PODXL as a glycoprotein substrate of ST6GAL1 and a candidate target to counter chemoevasion-associated metastasis of quiescent tumor cells.
2023-09-07Integration of spatial and single-cell data across modalities with weakly linked featuresNolan G, Ma Z, Zhang NTMC-StanfordAlthough single-cell and spatial sequencing methods enable simultaneous measurement of more than one biological modality, no technology can capture all modalities within the same cell. For current data integration methods, the feasibility of cross-modal integration relies on the existence of highly correlated, a priori 'linked' features. We describe matching X-modality via fuzzy smoothed embedding (MaxFuse), a cross-modal data integration method that, through iterative coembedding, data smoothing and cell matching, uses all information in each modality to obtain high-quality integration even when features are weakly linked. MaxFuse is modality-agnostic and demonstrates high robustness and accuracy in the weak linkage scenario, achieving 20~70% relative improvement over existing methods under key evaluation metrics on benchmarking datasets. A prototypical example of weak linkage is the integration of spatial proteomic data with single-cell sequencing data. On two example analyses of this type, MaxFuse enabled the spatial consolidation of proteomic, transcriptomic and epigenomic information at single-cell resolution on the same tissue section.
2023-09-08Segmentation quality assessment by automated detection of erroneous surface regions in medical imagesZaman FA, Zhang L, Zhang H, Sonka M, Wu XTMC-CHOPDespite the advancement in deep learning-based semantic segmentation methods, which have achieved accuracy levels of field experts in many computer vision applications, the same general approaches may frequently fail in 3D medical image segmentation due to complex tissue structures, noisy acquisition, disease-related pathologies, as well as the lack of sufficiently large datasets with associated annotations. For expeditious diagnosis and quantitative image analysis in large-scale clinical trials, there is a compelling need to predict segmentation quality without ground truth. In this paper, we propose a deep learning framework to locate erroneous regions on the boundary surfaces of segmented objects for quality control and assessment of segmentation. A Convolutional Neural Network (CNN) is explored to learn the boundary related image features of multi-objects that can be used to identify location-specific inaccurate segmentation. The predicted error locations can facilitate efficient user interaction for interactive image segmentation (IIS). We evaluated the proposed method on two data sets: Osteoarthritis Initiative (OAI) 3D knee MRI and 3D calf muscle MRI. The average sensitivity scores of 0.95 and 0.92, and the average positive predictive values of 0.78 and 0.91 were achieved, respectively, for erroneous surface region detection of knee cartilage segmentation and calf muscle segmentation. Our experiment demonstrated promising performance of the proposed method for segmentation quality assessment by automated detection of erroneous surface regions in medical images.
2023-09-11OME-Zarr: a cloud-optimized bioimaging file format with international community supportMoore J, Basurto-Lozada D, Besson S, Bogovic J, Bragantini J, Brown EM, Burel JM, Casas Moreno X, de Medeiros G, Diel EE, Gault D, Ghosh SS, Gold I, Halchenko YO, Hartley M, Horsfall D, Keller MS, Kittisopikul M, Kovacs G, Küpcü Yoldaş A, Kyoda K, le Tournoulx de la Villegeorges A, Li T, Liberali P, Lindner D, Linkert M, Lüthi J, Maitin-Shepard J, Manz T, Marconato L, McCormick M, Lange M, Mohamed K, Moore W, Norlin N, Ouyang W, Özdemir B, Palla G, Pape C, Pelkmans L, Pietzsch T, Preibisch S, Prete M, Rzepka N, Samee S, Schaub N, Sidky H, Solak AC, Stirling DR, Striebel J, Tischer C, Toloudis D, Virshup I, Walczysko P, Watson AM, Weisbart E, Wong F, Yamauchi KA, Bayraktar O, Cimini BA, Gehlenborg N, Haniffa M, Hotaling N, Onami S, Royer LA, Saalfeld S, Stegle O, Theis FJ, Swedlow JRHIVE TC-HarvardA growing community is constructing a next-generation file format (NGFF) for bioimaging to overcome problems of scalability and heterogeneity. Organized by the Open Microscopy Environment (OME), individuals and institutes across diverse modalities facing these problems have designed a format specification process (OME-NGFF) to address these needs. This paper brings together a wide range of those community members to describe the cloud-optimized format itself (OME-Zarr), along with tools and data resources available today to increase FAIR access and remove barriers in the scientific process. The current momentum offers an opportunity to unify a key component of the bioimaging domain: the file format that underlies so many personal, institutional, and global data management and analysis tasks.
2023-09-11Transcriptomic profiling of tissue environments critical for post-embryonic patterning and morphogenesis of zebrafish skinAman AJ, Saunders LM, Carr AA, Srivatasan S, Eberhard C, Carrington B, Watkins-Chow D, Pavan WJ, Trapnell C, Parichy DMTMC-Cal TechPigment patterns and skin appendages are prominent features of vertebrate skin. In zebrafish, regularly patterned pigment stripes and an array of calcified scales form simultaneously in the skin during post-embryonic development. Understanding the mechanisms that regulate stripe patterning and scale morphogenesis may lead to the discovery of fundamental mechanisms that govern the development of animal form. To learn about cell types and signaling interactions that govern skin patterning and morphogenesis, we generated and analyzed single-cell transcriptomes of skin from wild-type fish as well as fish having genetic or transgenically induced defects in squamation or pigmentation. These data reveal a previously undescribed population of epidermal cells that express transcripts encoding enamel matrix proteins, suggest hormonal control of epithelial-mesenchymal signaling, clarify the signaling network that governs scale papillae development, and identify a critical role for the hypodermis in supporting pigment cell development. Additionally, these comprehensive single-cell transcriptomic data representing skin phenotypes of biomedical relevance should provide a useful resource for accelerating the discovery of mechanisms that govern skin development and homeostasis.
2023-09-11Semantic-Aware Contrastive Learning for Multi-Object Medical Image SegmentationLee HH, Tang Y, Yang Q, Yu X, Cai LY, Remedios LW, Bao S, Landman BA, Huo Y.TMC-Vanderbilt (Kidney)Medical image segmentation, or computing voxel-wise semantic masks, is a fundamental yet challenging task in medical imaging domain. To increase the ability of encoder-decoder neural networks to perform this task across large clinical cohorts, contrastive learning provides an opportunity to stabilize model initialization and enhances downstream tasks performance without ground-truth voxel-wise labels. However, multiple target objects with different semantic meanings and contrast level may exist in a single image, which poses a problem for adapting traditional contrastive learning methods from prevalent "image-level classification" to "pixel-level segmentation". In this article, we propose a simple semantic-aware contrastive learning approach leveraging attention masks and image-wise labels to advance multi-object semantic segmentation. Briefly, we embed different semantic objects to different clusters rather than the traditional image-level embeddings. We evaluate our proposed method on a multi-organ medical image segmentation task with both in-house data and MICCAI Challenge 2015 BTCV datasets. Compared with current state-of-the-art training strategies, our proposed pipeline yields a substantial improvement of 5.53% and 6.09% on Dice score for both medical image segmentation cohorts respectively (p-value 0.01). The performance of the proposed method is further assessed on external medical image cohort via MICCAI Challenge FLARE 2021 dataset, and achieves a substantial improvement from Dice 0.922 to 0.933 (p-value 0.01).
2023-09-14Scalable Nanopore sequencing of human genomes provides a comprehensive view of haplotype-resolved variation and methylationKolmogorov M, Billingsley KJ, Mastoras M, Meredith M, Monlong J, Lorig-Roach R, Asri M, Alvarez Jerez P, Malik L, Dewan R, Reed X, Genner RM, Daida K, Behera S, Shafin K, Pesout T, Prabakaran J, Carnevali P, Yang J, Rhie A, Scholz SW, Traynor BJ, Miga KH, Jain M, Timp W, Phillippy AM, Chaisson M, Sedlazeck FJ, Blauwendraat C, Paten BHIVE TC-CMULong-read sequencing technologies substantially overcome the limitations of short-reads but have not been considered as a feasible replacement for population-scale projects, being a combination of too expensive, not scalable enough or too error-prone. Here we develop an efficient and scalable wet lab and computational protocol, Napu, for Oxford Nanopore Technologies long-read sequencing that seeks to address those limitations. We applied our protocol to cell lines and brain tissue samples as part of a pilot project for the National Institutes of Health Center for Alzheimer's and Related Dementias. Using a single PromethION flow cell, we can detect single nucleotide polymorphisms with F1-score comparable to Illumina short-read sequencing. Small indel calling remains difficult within homopolymers and tandem repeats, but achieves good concordance to Illumina indel calls elsewhere. Further, we can discover structural variants with F1-score on par with state-of-the-art de novo assembly methods. Our protocol phases small and structural variants at megabase scales and produces highly accurate, haplotype-specific methylation calls.
2023-09-15Early cancer detection by SERS spectroscopy and machine learningShi L, Li Y, Li ZTMC-URMCA new approach for early detection of multiple cancers is presented by integrating SERS spectroscopy of serum molecular fingerprints and machine learning.
2023-09-15Rapid Setup of Tissue Microarray and Tiled Area Imaging on the Multiplexed Ion Beam Imaging Microscope using the Tile/SED/Array InterfacePiyadasa H, Oberlton B, Kong A, Camacho Fullaway C, Reddy Varra S, Sowers C, Tsai AGTMC-Stanford (Bone marrow)Multiplexed ion beam imaging (MIBI) is a next-generation mass spectrometry-based microscopy technique that generates 40+ plex images of protein expression in histologic tissues, enabling detailed dissection of cellular phenotypes and histoarchitectural organization. A key bottleneck in operation occurs when users select the physical locations on the tissue for imaging. As the scale and complexity of MIBI experiments have increased, the manufacturer-provided interface and third-party tools have become increasingly unwieldy for imaging large tissue microarrays and tiled tissue areas. Thus, a web-based, interactive, what-you-see-is-what-you-get (WYSIWYG) graphical interface layer - the tile/SED/array Interface (TSAI) - was developed for users to set imaging locations using familiar and intuitive mouse gestures such as drag-and-drop, click-and-drag, and polygon drawing. Written according to web standards already built into modern web browsers, it requires no installation of external programs, extensions, or compilers. Of interest to the hundreds of current MIBI users, this interface dramatically simplifies and accelerates the setup of large, complex MIBI runs.
2023-09-19Surgical management of lymphedema: Does a microsurgeon's bias exist?Friedman R, Ismail Aly ME, Singhal DTMC-BIDMCN/A
2023-09-19Prospective on Imaging Mass Spectrometry in Clinical DiagnosticsMoore JL, Patterson NH, Norris JL, Caprioli RMTMC-Vanderbilt (Kidney)Imaging mass spectrometry (IMS) is a molecular technology utilized for spatially driven research, providing molecular maps from tissue sections. This article reviews matrix-assisted laser desorption ionization (MALDI) IMS and its progress as a primary tool in the clinical laboratory. MALDI mass spectrometry has been used to classify bacteria and perform other bulk analyses for plate-based assays for many years. However, the clinical application of spatial data within a tissue biopsy for diagnoses and prognoses is still an emerging opportunity in molecular diagnostics. This work considers spatially driven mass spectrometry approaches for clinical diagnostics and addresses aspects of new imaging-based assays that include analyte selection, quality control/assurance metrics, data reproducibility, data classification, and data scoring. It is necessary to implement these tasks for the rigorous translation of IMS to the clinical laboratory; however, this requires detailed standardized protocols for introducing IMS into the clinical laboratory to deliver reliable and reproducible results that inform and guide patient care.
2023-09-21Multimodal single-cell datasets characterize antigen-specific CD8+ T cells across SARS-CoV-2 vaccination and infectionZhang B, Upadhyay R, Hao Y, Samanovic MI, Herati RS, Blair JD, Axelrad J, Mulligan MJ, Littman DR, Satija RHIVE MC-NYGCThe immune response to SARS-CoV-2 antigen after infection or vaccination is defined by the durable production of antibodies and T cells. Population-based monitoring typically focuses on antibody titer, but there is a need for improved characterization and quantification of T cell responses. Here, we used multimodal sequencing technologies to perform a longitudinal analysis of circulating human leukocytes collected before and after immunization with the mRNA vaccine BNT162b2. Our data indicated distinct subpopulations of CD8+ T cells, which reliably appeared 28 days after prime vaccination. Using a suite of cross-modality integration tools, we defined their transcriptome, accessible chromatin landscape and immunophenotype, and we identified unique biomarkers within each modality. We further showed that this vaccine-induced population was SARS-CoV-2 antigen-specific and capable of rapid clonal expansion. Moreover, we identified these CD8+ T cell populations in scRNA-seq datasets from COVID-19 patients and found that their relative frequency and differentiation outcomes were predictive of subsequent clinical outcomes.
2023-09-28Navigating the kidney organoid: insights into assessment and enhancement of nephron functionTabibzadeh N, Satlin LM, Jain S, Morizane RTMC-WUSTLKidney organoids are three-dimensional structures generated from pluripotent stem cells (PSCs) that are capable of recapitulating the major structures of mammalian kidneys. As this technology is expected to be a promising tool for studying renal biology, drug discovery, and regenerative medicine, the functional capacity of kidney organoids has emerged as a critical question in the field. Kidney organoids produced using several protocols harbor key structures of native kidneys. Here we review the current state, recent advances, and future challenges in the functional characterization of kidney organoids, strategies to accelerate and enhance kidney organoid functions, and access to PSC resources to advance organoid research. The strategies to construct physiologically relevant kidney organoids include the use of organ-on-a-chip technologies that integrate fluid circulation and improve organoid maturation. These approaches result in increased expression of the major tubular transporters and elements of mechanosensory signaling pathways suggestive of improved functionality. Nevertheless, continuous efforts remain crucial to create kidney tissue that more faithfully replicates physiological conditions for future applications in kidney regeneration medicine and their ethical use in patient care.
2023-10-02The technological landscape and applications of single-cell multi-omicsBaysoy A, Bai Z, Satija R, Fan RTTD-YaleSingle-cell multi-omics technologies and methods characterize cell states and activities by simultaneously integrating various single-modality omics methods that profile the transcriptome, genome, epigenome, epitranscriptome, proteome, metabolome and other (emerging) omics. Collectively, these methods are revolutionizing molecular cell biology research. In this comprehensive Review, we discuss established multi-omics technologies as well as cutting-edge and state-of-the-art methods in the field. We discuss how multi-omics technologies have been adapted and improved over the past decade using a framework characterized by optimization of throughput and resolution, modality integration, uniqueness and accuracy, and we also discuss multi-omics limitations. We highlight the impact that single-cell multi-omics technologies have had in cell lineage tracing, tissue-specific and cell-specific atlas production, tumour immunology and cancer genetics, and in mapping of cellular spatial information in fundamental and translational research. Finally, we discuss bioinformatics tools that have been developed to link different omics modalities and elucidate functionality through the use of better mathematical modelling and computational methods.
2023-10-09High-plex protein and whole transcriptome co-mapping at cellular resolution with spatial CITE-seqLiu Y, DiStasio M, Su G, Asashima H, Enninful A, Qin X, Deng Y, Nam J, Gao F, Bordignon P, Cassano M, Tomayko M, Xu M, Halene S, Craft JE, Hafler D, Fan RTTD-YaleIn this study, we extended co-indexing of transcriptomes and epitopes (CITE) to the spatial dimension and demonstrated high-plex protein and whole transcriptome co-mapping. We profiled 189 proteins and whole transcriptome in multiple mouse tissue types with spatial CITE sequencing and then further applied the method to measure 273 proteins and transcriptome in human tissues, revealing spatially distinct germinal center reactions in tonsil and early immune activation in skin at the Coronavirus Disease 2019 mRNA vaccine injection site.
2023-10-14Potential and risks of artificial intelligence models: Common in medicine practice and special in pediatric urologyWen Y, Di HTMC-StanfordN/A
2023-10-23High-resolution integrated microfluidic probe for mass spectrometry imaging of biological tissuesLi X, Hu H, Laskin JTTD-PurdueNanospray desorption electrospray ionization (nano-DESI) is an ambient ionization technique that enables molecular imaging of biological samples with high spatial resolution. We have recently developed an integrated microfluidic probe (iMFP) for nano-DESI mass spectrometry imaging (MSI) that significantly enhances the robustness of the technique. In this study, we designed a new probe that enables imaging of biological samples with high spatial resolution. The new probe design features smaller primary and spray channels and an entirely new configuration of the sampling port that enables robust imaging of tissues with a spatial resolution of 8-10 μm. We demonstrate the spatial resolution, sensitivity, durability, and throughput of the iMFP by imaging mouse uterine and brain tissue sections. The robustness of the high-resolution iMFP allowed us to perform first imaging experiments with both high spatial resolution and high throughput, which is particularly advantageous for high-resolution imaging of large tissue sections of interest to most MSI applications. Overall, the new probe design opens opportunities for mapping of biomolecules in biological samples with high throughput and cellular resolution, which is important for understanding biological systems.
2023-11-14Dimension-agnostic and granularity-based spatially variable gene identification using BSPWang J, Li J, Kramer ST, Su L, Chang Y, Xu C, Eadon MT, Kiryluk K, Ma Q, Xu DTMC-WUSTLIdentifying spatially variable genes (SVGs) is critical in linking molecular cell functions with tissue phenotypes. Spatially resolved transcriptomics captures cellular-level gene expression with corresponding spatial coordinates in two or three dimensions and can be used to infer SVGs effectively. However, current computational methods may not achieve reliable results and often cannot handle three-dimensional spatial transcriptomic data. Here we introduce BSP (big-small patch), a non-parametric model that compares gene expression patterns at two spatial granularities to identify SVGs from two- or three-dimensional spatial transcriptomics data in a fast and robust manner. This method has been extensively tested in simulations, demonstrating superior accuracy, robustness, and high efficiency. BSP is further validated by substantiated biological discoveries in cancer, neural science, rheumatoid arthritis, and kidney studies with various types of spatial transcriptomics technologies.
2023-12-04Evidence for lung barrier regeneration by differentiation prior to binucleated and stem cell divisionGuild J, Juul NH, Andalon A, Taenaka H, Coffey RJ, Matthay MA, Desai TJTTD-StanfordWith each breath, oxygen diffuses across remarkably thin alveolar type I (AT1) cells into underlying capillaries. Interspersed cuboidal AT2 cells produce surfactant and act as stem cells. Even transient disruption of this delicate barrier can promote capillary leak. Here, we selectively ablated AT1 cells, which uncovered rapid AT2 cell flattening with near-continuous barrier preservation, culminating in AT1 differentiation. Proliferation subsequently restored depleted AT2 cells in two phases, mitosis of binucleated AT2 cells followed by replication of mononucleated AT2 cells. M phase entry of binucleated and S phase entry of mononucleated cells were both triggered by AT1-produced hbEGF signaling via EGFR to Wnt-active AT2 cells. Repeated AT1 cell killing elicited exuberant AT2 proliferation, generating aberrant daughter cells that ceased surfactant function yet failed to achieve AT1 differentiation. This hyperplasia eventually resolved, yielding normal-appearing alveoli. Overall, this specialized regenerative program confers a delicate simple epithelium with functional resiliency on par with the physical durability of thicker, pseudostratified, or stratified epithelia.
2023-12-12Advances in Imaging Mass Spectrometry for Biomedical and Clinical ResearchDjambazova KV, van Ardenne JM, Spraggins JMTMC-Vanderbilt (Kidney)Imaging mass spectrometry (IMS) allows for the untargeted mapping of biomolecules directly from tissue sections. This technology is increasingly integrated into biomedical and clinical research environments to supplement traditional microscopy and provide molecular context for tissue imaging. IMS has widespread clinical applicability in the fields of oncology, dermatology, microbiology, and others. This review summarizes the two most widely employed IMS technologies, matrix-assisted laser desorption/ionization (MALDI) and desorption electrospray ionization (DESI), and covers technological advancements, including efforts to increase spatial resolution, specificity, and throughput. We also highlight recent biomedical applications of IMS, primarily focusing on disease diagnosis, classification, and subtyping.
2023-12-15Spatial pharmacology using mass spectrometry imagingRajbhandari P, Neelakantan TV, Hosny N, Stockwell BRTTD-Columbia/Penn StateThe emerging and powerful field of spatial pharmacology can map the spatial distribution of drugs and their metabolites, as well as their effects on endogenous biomolecules including metabolites, lipids, proteins, peptides, and glycans, without the need for labeling. This is enabled by mass spectrometry imaging (MSI) that provides previously inaccessible information in diverse phases of drug discovery and development. We provide a perspective on how MSI technologies and computational tools can be implemented to reveal quantitative spatial drug pharmacokinetics and toxicology, tissue subtyping, and associated biomarkers. We also highlight the emerging potential of comprehensive spatial pharmacology through integration of multimodal MSI data with other spatial technologies. Finally, we describe how to overcome challenges including improving reproducibility and compound annotation to generate robust conclusions that will improve drug discovery and development processes.
2024-01-09Slide-tags enables single-nucleus barcoding for multimodal spatial genomicsRussell AJC, Weir JA, Nadaf NM, Shabet M, Kumar V, Kambhampati S, Raichur R, Marrero GJ, Liu S, Balderrama KS, Vanderburg CR, Shanmugam V, Tian L, Iorgulescu JB, Yoon CH, Wu CJ, Macosko EZ, Chen FRTI-BroadRecent technological innovations have enabled the high-throughput quantification of gene expression and epigenetic regulation within individual cells, transforming our understanding of how complex tissues are constructed. However, missing from these measurements is the ability to routinely and easily spatially localize these profiled cells. We developed a strategy, Slide-tags, in which single nuclei within an intact tissue section are tagged with spatial barcode oligonucleotides derived from DNA-barcoded beads with known positions. These tagged nuclei can then be used as an input into a wide variety of single-nucleus profiling assays. Application of Slide-tags to the mouse hippocampus positioned nuclei at less than 10 μm spatial resolution and delivered whole-transcriptome data that are indistinguishable in quality from ordinary single-nucleus RNA-sequencing data. To demonstrate that Slide-tags can be applied to a wide variety of human tissues, we performed the assay on brain, tonsil and melanoma. We revealed cell-type-specific spatially varying gene expression across cortical layers and spatially contextualized receptor-ligand interactions driving B cell maturation in lymphoid tissue. A major benefit of Slide-tags is that it is easily adaptable to almost any single-cell measurement technology. As a proof of principle, we performed multiomic measurements of open chromatin, RNA and T cell receptor (TCR) sequences in the same cells from metastatic melanoma, identifying transcription factor motifs driving cancer cell state transitions in spatially distinct microenvironments.
Slide-tags offers a universal platform for importing the compendium of established single-cell measurements into the spatial genomics repertoire.
2024-01-10The chromatin landscape of healthy and injured cell types in the human kidneyGisch DL, Brennan M, Lake BB, Basta J, Keller MS, Melo Ferreira R, Akilesh S, Ghag R, Lu C, Cheng YH, Collins KS, Parikh SV, Rovin BH, Robbins L, Stout L, Conklin KY, Diep D, Zhang B, Knoten A, Barwinska D, Asghari M, Sabo AR, Ferkowicz MJ, Sutton TA, Kelly KJ, De Boer IH, Rosas SE, Kiryluk K, Hodgin JB, Alakwaa F, Winfree S, Jefferson N, Türkmen A, Gaut JP, Gehlenborg N, Phillips CL, El-Achkar TM, Dagher PC, Hato T, Zhang K, Himmelfarb J, Kretzler M, Mollah S; Kidney Precision Medicine Project (KPMP); Jain S, Rauchman M, Eadon MTTMC-WUSTLThere is a need to define regions of gene activation or repression that control human kidney cells in states of health, injury, and repair to understand the molecular pathogenesis of kidney disease and design therapeutic strategies. Comprehensive integration of gene expression with epigenetic features that define regulatory elements remains a significant challenge. We measure dual single nucleus RNA expression and chromatin accessibility, DNA methylation, and H3K27ac, H3K4me1, H3K4me3, and H3K27me3 histone modifications to decipher the chromatin landscape and gene regulation of the kidney in reference and adaptive injury states. We establish a spatially-anchored epigenomic atlas to define the kidney's active, silent, and regulatory accessible chromatin regions across the genome. Using this atlas, we note distinct control of adaptive injury in different epithelial cell types. A proximal tubule cell transcription factor network of ELF3, KLF6, and KLF10 regulates the transition between health and injury, while in thick ascending limb cells this transition is regulated by NR2F1. Further, combined perturbation of ELF3, KLF6, and KLF10 distinguishes two adaptive proximal tubular cell subtypes, one of which manifested a repair trajectory after knockout. 
This atlas will serve as a foundation to facilitate targeted cell-specific therapeutics by reprogramming gene regulatory networks.
2024-01-25Gene panel selection for targeted spatial transcriptomics.Zhang Y, Petukhov V, Biederstedt E, Que R, Zhang K, Kharchenko PVTMC-UCSDTargeted spatial transcriptomics hold particular promise in analyzing complex tissues. Most such methods, however, measure only a limited panel of transcripts, which need to be selected in advance to inform on the cell types or processes being studied. A limitation of existing gene selection methods is their reliance on scRNA-seq data, ignoring platform effects between technologies. Here we describe gpsFISH, a computational method performing gene selection through optimizing detection of known cell types. By modeling and adjusting for platform effects, gpsFISH outperforms other methods. Furthermore, gpsFISH can incorporate cell type hierarchies and custom gene preferences to accommodate diverse design requirements.
2024-02-01Expanding the coverage of spatial proteomics: a machine learning approachSun H, Li J, Murphy RFHIVE TC-CMUMotivation: Multiplexed protein imaging methods use a chosen set of markers and provide valuable information about complex tissue structure and cellular heterogeneity. However, the number of markers that can be measured in the same tissue sample is inherently limited. Results: In this paper, we present an efficient method to choose a minimal predictive subset of markers that for the first time allows the prediction of full images for a much larger set of markers. We demonstrate that our approach also outperforms previous methods for predicting cell-level protein composition. Most importantly, we demonstrate that our approach can be used to select a marker set that enables prediction of a much larger set than could be measured concurrently. Availability and implementation: All code and intermediate results are available in a Reproducible Research Archive at https://github.com/murphygroup/CODEXPanelOptimization.
2024-02-02Parsing 20 Years of Public Data by AI Maps Trends in Proteomics and Forecasts TechnologyGreen JJ, Grimm C, Fristo A, Byrum J, Kelleher NLRTI-NorthwesternThe trends of the last 20 years in biotechnology were revealed using artificial intelligence and natural language processing (NLP) of publicly available data. Implementing this "science-of-science" approach, we capture convergent trends in the field of proteomics in both technology development and application across the phylogenetic tree of life. With major gaps in our knowledge about protein composition, structure, and location over time, we report trends in persistent, popular approaches and emerging technologies across 94 ideas from a corpus of 29 journals in PubMed over two decades. New metrics for clusters of these ideas reveal the progression and popularity of emerging approaches like single-cell, spatial, compositional, and chemical proteomics designed to better capture protein-level chemistry and biology. This analysis of the proteomics literature with advanced analytic tools quantifies the Rate of Rise for a next generation of technologies to better define, quantify, and visualize the multiple dimensions of the proteome that will transform our ability to measure and understand proteins in the coming decade.
2024-02-13Modelling post-implantation human development to yolk sac blood emergenceHislop J, Song Q, Keshavarz F K, Alavi A, Schoenberger R, LeGraw R, Velazquez JJ, Mokhtari T, Taheri MN, Rytel M, Chuva de Sousa Lopes SM, Watkins S, Stolz D, Kiani S, Sozen B, Bar-Joseph Z, Ebrahimkhani MRHIVE TC-CMUImplantation of the human embryo begins a critical developmental stage that comprises profound events including axis formation, gastrulation and the emergence of haematopoietic system. Our mechanistic knowledge of this window of human life remains limited due to restricted access to in vivo samples for both technical and ethical reasons. Stem cell models of human embryo have emerged to help unlock the mysteries of this stage. Here we present a genetically inducible stem cell-derived embryoid model of early post-implantation human embryogenesis that captures the reciprocal codevelopment of embryonic tissue and the extra-embryonic endoderm and mesoderm niche with early haematopoiesis. This model is produced from induced pluripotent stem cells and shows unanticipated self-organizing cellular programmes similar to those that occur in embryogenesis, including the formation of amniotic cavity and bilaminar disc morphologies as well as the generation of an anterior hypoblast pole and posterior domain. The extra-embryonic layer in these embryoids lacks trophoblast and shows advanced multilineage yolk sac tissue-like morphogenesis that harbours a process similar to distinct waves of haematopoiesis, including the emergence of erythroid-, megakaryocyte-, myeloid- and lymphoid-like cells. This model presents an easy-to-use, high-throughput, reproducible and scalable platform to probe multifaceted aspects of human development and blood formation at the early post-implantation stage. It will provide a tractable human-based model for drug testing and disease modelling.
2024-02-20Stabilized mosaic single-cell data integration using unshared featuresGhazanfar S, Guibentif C, Marioni JCHIVE MC-NYGCCurrently available single-cell omics technologies capture many unique features with different biological information content. Data integration aims to place cells, captured with different technologies, onto a common embedding to facilitate downstream analytical tasks. Current horizontal data integration techniques use a set of common features, thereby ignoring non-overlapping features and losing information. Here we introduce StabMap, a mosaic data integration technique that stabilizes mapping of single-cell data by exploiting the non-overlapping features. StabMap first infers a mosaic data topology based on shared features, then projects all cells onto supervised or unsupervised reference coordinates by traversing shortest paths along the topology. We show that StabMap performs well in various simulation contexts, facilitates 'multi-hop' mosaic data integration where some datasets do not share any features and enables the use of spatial gene expression features for mapping dissociated single-cell data onto a spatial transcriptomic reference.
2024-02-20Thermal-plex: fluidic-free, rapid sequential multiplexed imaging with DNA-encoded thermal channelsHong F, Kishi JY, Delgado RN, Jeong J, Saka SK, Su H, Cepko CL, Yin PTTD-HarvardMultiplexed fluorescence imaging is typically limited to three- to five-plex on standard setups. Sequential imaging methods based on iterative labeling and imaging enable practical higher multiplexing, but generally require a complex fluidic setup with several rounds of slow buffer exchange (tens of minutes to an hour for each exchange step). We report the thermal-plex method, which removes complex and slow buffer exchange steps and provides fluidic-free, rapid sequential imaging. Thermal-plex uses simple DNA probes that are engineered to fluoresce sequentially when, and only when, activated with transient exposure to heating spikes at designated temperatures (thermal channels). Channel switching is fast (<30 s) and is achieved with a commercially available and affordable on-scope heating device. We demonstrate 15-plex RNA imaging (five thermal × three fluorescence channels) in fixed cells and retina tissues in less than 4 min, without using buffer exchange or fluidics. Thermal-plex introduces a new labeling method for efficient sequential multiplexed imaging.
2024-02-20 | Unsupervised and supervised discovery of tissue cellular neighborhoods from cell phenotypes | Hu Y, Rong J, Xu Y, Xie R, Peng J, Gao L, Tan K | TMC-CHOP | It is poorly understood how different cells in a tissue organize themselves to support tissue functions. We describe the CytoCommunity algorithm for the identification of tissue cellular neighborhoods (TCNs) based on cell phenotypes and their spatial distributions. CytoCommunity learns a mapping directly from the cell phenotype space to the TCN space using a graph neural network model without intermediate clustering of cell embeddings. By leveraging graph pooling, CytoCommunity enables de novo identification of condition-specific and predictive TCNs under the supervision of sample labels. Using several types of spatial omics data, we demonstrate that CytoCommunity can identify TCNs of variable sizes with substantial improvement over existing methods. By analyzing risk-stratified colorectal and breast cancer data, CytoCommunity revealed new granulocyte-enriched and cancer-associated fibroblast-enriched TCNs specific to high-risk tumors and altered interactions between neoplastic and immune or stromal cells within and between TCNs. CytoCommunity can perform unsupervised and supervised analyses of spatial omics maps and enable the discovery of condition-specific cell-cell communication patterns across spatial scales.
2024-02-21 | Multi-molecular hyperspectral PRM-SRS microscopy | Zhang W, Li Y, Fung AA, Li Z, Jang H, Zha H, Chen X, Gao F, Wu JY, Sheng H, Yao J, Skowronska-Krawczyk D, Jain S, Shi L | TMC-WUSTL | Lipids play crucial roles in many biological processes. Mapping spatial distributions and examining the metabolic dynamics of different lipid subtypes in cells and tissues are critical to better understanding their roles in aging and diseases. Commonly used imaging methods (such as mass spectrometry-based, fluorescence labeling, conventional optical imaging) can disrupt the native environment of cells/tissues, have limited spatial or spectral resolution, or cannot distinguish different lipid subtypes. Here we present a hyperspectral imaging platform that integrates a Penalized Reference Matching algorithm with Stimulated Raman Scattering (PRM-SRS) microscopy. Using this platform, we visualize and identify high density lipoprotein particles in human kidney, a high cholesterol to phosphatidylethanolamine ratio inside granule cells of mouse hippocampus, and subcellular distributions of sphingosine and cardiolipin in human brain. Our PRM-SRS displays unique advantages of enhanced chemical specificity, subcellular resolution, and fast data processing in distinguishing lipid subtypes in different organs and species.
2024-03-04 | Proteome-scale tissue mapping using mass spectrometry based on label-free and multiplexed workflows | Kwon Y, Woo J, Yu F, Williams SM, Markillie LM, Moore RJ, Nakayasu ES, Chen J, Campbell-Thompson M, Mathews CE, Nesvizhskii AI, Qian WJ, Zhu Y | TMC-PNNL | Multiplexed biomolecular profiling of the tissue microenvironment, or spatial omics, can provide deep insight into cellular compositions and interactions in both normal and diseased tissues. Proteome-scale tissue mapping, which aims to unbiasedly visualize all the proteins in a whole tissue section or region of interest, has attracted significant interest because it holds great potential to directly reveal diagnostic biomarkers and therapeutic targets. While many approaches are available, proteome mapping still exhibits significant technical challenges in both protein coverage and analytical throughput. Since many of these existing challenges are associated with mass spectrometry-based protein identification and quantification, we performed a detailed benchmarking study of three protein quantification methods for spatial proteome mapping: label-free, TMT-MS2, and TMT-MS3. Our study indicates that the label-free method provided the deepest coverage of ~3500 proteins at a spatial resolution of 50 μm and the largest quantification dynamic range, while the TMT-MS2 method holds great benefit in mapping throughput at >125 pixels per day. The evaluation also indicates that both label-free and TMT-MS2 provide robust protein quantification in terms of identifying differentially abundant proteins and spatially co-variable clusters. In the study of the pancreatic islet microenvironment, we demonstrated that deep proteome mapping not only enables the identification of protein markers specific to different cell types but, more importantly, also reveals unknown or hidden protein patterns through spatial co-expression analysis.
2024-03-05 | A Panoramic View of Cell Population Dynamics in Mammalian Aging | Zhang Z, Schaefer C, Jiang W, Lu Z, Lee J, Sziraki A, Abdulraouf A, Wick B, Haeussler M, Li Z, Molla G, Satija R, Zhou W, Cao J | HIVE MC-NYGC | To elucidate the aging-associated cellular population dynamics throughout the body, here we present PanSci, a single-cell transcriptome atlas profiling over 20 million cells from 623 mouse tissue samples, encompassing a range of organs across different life stages, sexes, and genotypes. This comprehensive dataset allowed us to identify more than 3,000 unique cellular states and catalog over 200 distinct aging-associated cell populations experiencing significant depletion or expansion. Our panoramic analysis uncovered temporally structured, organ- and lineage-specific shifts of cellular dynamics during lifespan progression. Moreover, we investigated aging-associated alterations in immune cell populations, revealing both widespread shifts and organ-specific changes. We further explored the regulatory roles of the immune system on aging and pinpointed specific age-related cell population expansions that are lymphocyte-dependent. The breadth and depth of our 'cell-omics' methodology not only enhance our comprehension of cellular aging but also lay the groundwork for exploring the complex regulatory networks among varied cell types in the context of aging and aging-associated diseases.
2024-03-08 | Predicting drug outcome of population via clinical knowledge graph | Brbić M, Yasunaga M, Agarwal P, Leskovec J | TMC-Stanford | Optimal treatments depend on numerous factors such as drug chemical properties, disease biology, and the characteristics of the patients to whom the treatment is applied. To realize the promise of AI in healthcare, there is a need for designing systems that can capture patient heterogeneity and relevant biomedical knowledge. Here we present PlaNet, a geometric deep learning framework that reasons over population variability, disease biology, and drug chemistry by representing knowledge in the form of a massive clinical knowledge graph that can be enhanced by language models. Our framework is applicable to any sub-population, any drug as well as drug combinations, any disease, and to a wide range of pharmacological tasks. We apply the PlaNet framework to reason about outcomes of clinical trials: PlaNet predicts drug efficacy and adverse events, even for experimental drugs and their combinations that have never been seen by the model. Furthermore, PlaNet can estimate the effect of changing population on the trial outcome with direct implications on patient stratification in clinical trials. PlaNet takes fundamental steps towards AI-guided clinical trials design, offering valuable guidance for realizing the vision of precision medicine using AI.
2024-03-20 | Mapping human tissues with highly multiplexed RNA in situ hybridization | Kalhor K, Chen CJ, Lee HS, Cai M, Nafisi M, Que R, Palmer CR, Yuan Y, Zhang Y, Li X, Song J, Knoten A, Lake BB, Gaut JP, Keene CD, Lein E, Kharchenko PV, Chun J, Jain S, Fan JB, Zhang K | TMC-UCSD | In situ transcriptomic techniques promise a holistic view of tissue organization and cell-cell interactions. There has been a surge of multiplexed RNA in situ mapping techniques, but their application to human tissues has been limited due to their large size, generally lower tissue quality and high autofluorescence. Here we report DART-FISH, a padlock probe-based technology capable of profiling hundreds to thousands of genes in centimeter-sized human tissue sections. We introduce an omni-cell type cytoplasmic stain that substantially improves the segmentation of cell bodies. Our enzyme-free isothermal decoding procedure allows us to image 121 genes in large sections from the human neocortex in <10 h. We successfully recapitulated the cytoarchitecture of 20 neuronal and non-neuronal subclasses. We further performed in situ mapping of 300 genes on a diseased human kidney, profiled >20 healthy and pathological cell states, and identified diseased niches enriched in transcriptionally altered epithelial cells and myofibroblasts.
2024-03-25 | Piezo1 regulates meningeal lymphatic vessel drainage and alleviates excessive CSF accumulation | Choi D, Park E, Choi J, Lu R, Yu JS, Kim C, Zhao L, Yu J, Nakashima B, Lee S, Singhal D, Scallan JP, Zhou B, Koh CJ, Lee E, Hong YK | TMC-BIDMC | Piezo1 regulates multiple aspects of the vascular system by converting mechanical signals generated by fluid flow into biological processes. Here, we find that Piezo1 is necessary for the proper development and function of meningeal lymphatic vessels and that activating Piezo1 through transgenic overexpression or treatment with the chemical agonist Yoda1 is sufficient to increase cerebrospinal fluid (CSF) outflow by improving lymphatic absorption and transport. The abnormal accumulation of CSF, which often leads to hydrocephalus and ventriculomegaly, currently lacks effective treatments. We discovered that meningeal lymphatics in mouse models of Down syndrome were incompletely developed and abnormally formed. Selective overexpression of Piezo1 in lymphatics or systemic administration of Yoda1 in mice with hydrocephalus or Down syndrome resulted in a notable decrease in pathological CSF accumulation, ventricular enlargement and other associated disease symptoms. Together, our study highlights the importance of Piezo1-mediated lymphatic mechanotransduction in maintaining brain fluid drainage and identifies Piezo1 as a promising therapeutic target for treating excessive CSF accumulation and ventricular enlargement.
2024-04-01 | The Role of Endothelial Cells in Atherosclerosis: Insights from Genetic Association Studies | Pepin ME, Gupta R | DP-Harvard | Endothelial cells (ECs) mediate several biological functions that are relevant to atherosclerosis and coronary artery disease (CAD), regulating an array of vital processes including vascular tone, wound healing, reactive oxygen species, shear stress response, and inflammation. Although it is not yet known which of these functions is linked causally with CAD development and/or progression, genome-wide association studies have implicated more than 400 loci associated with CAD risk, among which several have shown EC-relevant functions. Given the arduous process of mechanistically interrogating single loci to CAD, high-throughput variant characterization methods, including pooled Clustered Regularly Interspaced Short Palindromic Repeats screens, offer exciting potential to rapidly accelerate the discovery of bona fide EC-relevant genetic loci. These discoveries in turn will broaden the therapeutic avenues for CAD beyond lipid lowering and behavioral risk modification to include EC-centric modalities of risk prevention and treatment.
2024-04-02 | Imaging Mass Spectrometry of Isotopically Resolved Intact Proteins on a Trapped Ion-Mobility Quadrupole Time-of-Flight Mass Spectrometer | Klein DR, Rivera ES, Caprioli RM, Spraggins JM | TMC-Vanderbilt (Kidney) | In this work, we demonstrate rapid, high spatial, and high spectral resolution imaging of intact proteins by matrix-assisted laser desorption/ionization (MALDI) imaging mass spectrometry (IMS) on a hybrid quadrupole-reflectron time-of-flight (qTOF) mass spectrometer equipped with trapped ion mobility spectrometry (TIMS). Historically, untargeted MALDI IMS of proteins has been performed on TOF mass spectrometers. While advances in TOF instrumentation have enabled rapid, high spatial resolution IMS of intact proteins, TOF mass spectrometers generate relatively low-resolution mass spectra with limited mass accuracy. Conversely, the implementation of MALDI sources on high-resolving power Fourier transform (FT) mass spectrometers has allowed IMS experiments to be conducted with high spectral resolution with the caveat of increasingly long data acquisition times. As illustrated here, qTOF mass spectrometers enable protein imaging with the combined advantages of TOF and FT mass spectrometers. Protein isotope distributions were resolved for both a protein standard mixture and proteins detected from a whole-body mouse pup tissue section. Rapid (∼10 pixels/s) 10 μm lateral spatial resolution IMS was performed on a rat brain tissue section while maintaining isotopic spectral resolution. Lastly, proof-of-concept MALDI-TIMS data was acquired from a protein mixture to demonstrate the ability to differentiate charge states by ion mobility. These experiments highlight the advantages of qTOF and timsTOF platforms for resolving and interpreting complex protein spectra generated from tissue by IMS.
2024-04-11 | An open source knowledge graph ecosystem for the life sciences | Callahan TJ, Tripodi IJ, Stefanski AL, Cappelletti L, Taneja SB, Wyrwa JM, Casiraghi E, Matentzoglu NA, Reese J, Silverstein JC, Hoyt CT, Boyce RD, Malec SA, Unni DR, Joachimiak MP, Robinson PN, Mungall CJ, Cavalleri E, Fontana T, Valentini G, Mesiti M, Gillenwater LA, Santangelo B, Vasilevsky NA, Hoehndorf R, Bennett TD, Ryan PB, Hripcsak G, Kahn MG, Bada M, Baumgartner WA Jr, Hunter LE | HIVE IEC-PSC | Translational research requires data at multiple scales of biological organization. Advancements in sequencing and multi-omics technologies have increased the availability of these data, but researchers face significant integration challenges. Knowledge graphs (KGs) are used to model complex phenomena, and methods exist to construct them automatically. However, tackling complex biomedical integration problems requires flexibility in the way knowledge is modeled. Moreover, existing KG construction methods provide robust tooling at the cost of fixed or limited choices among knowledge representation models. PheKnowLator (Phenotype Knowledge Translator) is a semantic ecosystem for automating the FAIR (Findable, Accessible, Interoperable, and Reusable) construction of ontologically grounded KGs with fully customizable knowledge representation. The ecosystem includes KG construction resources (e.g., data preparation APIs), analysis tools (e.g., SPARQL endpoint resources and abstraction algorithms), and benchmarks (e.g., prebuilt KGs). We evaluated the ecosystem by systematically comparing it to existing open-source KG construction methods and by analyzing its computational performance when used to construct 12 different large-scale KGs. With flexible knowledge representation, PheKnowLator enables fully customizable KGs without compromising performance or usability.