HuBMAP Publications

There are 379 publications.

Publish Date

Title

Author(s)

HuBMAP Component

Abstract

2023-06-01

An integrated cell atlas of the lung in health and disease

Sikkema L, Ramírez-Suástegui C, Strobl DC, Gillett TE, Zappia L, Madissoon E, Markov NS, Zaragosi LE, Ji Y, Ansari M, Arguel MJ, Apperloo L, Banchero M, Bécavin C, Berg M, Chichelnitskiy E, Chung MI, Collin A, Gay ACA, Gote-Schniering J, Hooshiar Kashani B, Inecik K, Jain M, Kapellos TS, Kole TM, Leroy S, Mayr CH, Oliver AJ, von Papen M, Peter L, Taylor CJ, Walzthoeni T, Xu C, Bui LT, De Donno C, Dony L, Faiz A, Guo M, Gutierrez AJ, Heumos L, Huang N, Ibarra IL, Jackson ND, Kadur Lakshminarasimha Murthy P, Lotfollahi M, Tabib T, Talavera-López C, Travaglini KJ, Wilbrey-Clark A, Worlock KB, Yoshida M; Lung Biological Network Consortium; van den Berge M, Bossé Y, Desai TJ, Eickelberg O, Kaminski N, Krasnow MA, Lafyatis R, Nikolic MZ, Powell JE, Rajagopal J, Rojas M, Rozenblatt-Rosen O, Seibold MA, Sheppard D, Shepherd DP, Sin DD, Timens W, Tsankov AM, Whitsett J, Xu Y, Banovich NE, Barbry P, Duong TE, Falk CS, Meyer KB, Kropski JA, Pe'er D, Schiller HB, Tata PR, Schultze JL, Teichmann SA, Misharin AV, Nawijn MC

TMC-URMC

Single-cell technologies have transformed our understanding of human tissues. Yet, studies typically capture only a limited number of donors and disagree on cell type definitions. Integrating many single-cell datasets can address these limitations of individual studies and capture the variability present in the population. Here we present the integrated Human Lung Cell Atlas (HLCA), combining 49 datasets of the human respiratory system into a single atlas spanning over 2.4 million cells from 486 individuals. The HLCA presents a consensus cell type re-annotation with matching marker genes, including annotations of rare and previously undescribed cell types. Leveraging the number and diversity of individuals in the HLCA, we identify gene modules that are associated with demographic covariates such as age, sex and body mass index, as well as gene modules changing expression along the proximal-to-distal axis of the bronchial tree. Mapping new data to the HLCA enables rapid data annotation and interpretation. Using the HLCA as a reference for the study of disease, we identify shared cell states across multiple lung diseases, including SPP1⁺ profibrotic monocyte-derived macrophages in COVID-19, pulmonary fibrosis and lung carcinoma. Overall, the HLCA serves as an example for the development and use of large-scale, cross-dataset organ atlases within the Human Cell Atlas.

2018-10-29

Identification of spatially associated subpopulations by combining scRNAseq and sequential fluorescence in situ hybridization data

Zhu Q, Shah S, Dries R, Cai L, Yuan GC.

TTD-Cal Tech

How intrinsic gene-regulatory networks interact with a cell's spatial environment to define its identity remains poorly understood. We developed an approach to distinguish between intrinsic and extrinsic effects on global gene expression by integrating analysis of sequencing-based and imaging-based single-cell transcriptomic profiles, using cross-platform cell type mapping combined with a hidden Markov random field model. We applied this approach to dissect the cell-type- and spatial-domain-associated heterogeneity in the mouse visual cortex region. Our analysis identified distinct spatially associated, cell-type-independent signatures in the glutamatergic and astrocyte cell compartments. Using these signatures to analyze single-cell RNA sequencing data, we identified previously unknown spatially associated subpopulations, which were validated by comparison with anatomical structures and Allen Brain Atlas images.

2018-12-11

Forecasting innovations in science, technology, and education

Börner K, Rouse WB, Trunfio P, Stanley HE.

HIVE MC-IU

Human survival depends on our ability to predict future outcomes so that we can make informed decisions. Human cognition and perception are optimized for local, short-term decision-making, such as deciding when to fight or flight, whom to mate, or what to eat. For more elaborate decisions (e.g., when to harvest, when to go to war or not, and whom to marry), people used to consult oracles—prophetic predictions of the future inspired by the gods. Over time, oracles were replaced by models of the structure and dynamics of natural, technological, and social systems. In the 21st century, computational models and visualizations of model results inform much of our decision-making: near real-time weather forecasts help us decide when to take an umbrella, plant, or harvest; where to ground airplanes; or when to evacuate inhabitants in the path of a hurricane, tornado, or flood. Long-term weather and climate forecasts predict a future with increasing torrential rains, stronger winds, and more frequent drought, landslides, and forest fires as well as rising sea levels, enabling decision makers to prepare for these changes by building dikes, moving cities and roads, and building larger water reservoirs and better storm sewers.

2018-12-19

Cell Hashing with barcoded antibodies enables multiplexing and doublet detection for single cell genomics

Stoeckius M, Zheng S, Houck-Loomis B, Hao S, Yeung BZ, Mauck WM, Smibert P, Satija R.

HIVE MC-NYGC

Despite rapid developments in single cell sequencing, sample-specific batch effects, detection of cell multiplets, and experimental costs remain outstanding challenges. Here, we introduce Cell Hashing, where oligo-tagged antibodies against ubiquitously expressed surface proteins uniquely label cells from distinct samples, which can be subsequently pooled. By sequencing these tags alongside the cellular transcriptome, we can assign each cell to its original sample, robustly identify cross-sample multiplets, and “super-load” commercial droplet-based systems for significant cost reduction. We validate our approach using a complementary genetic approach and demonstrate how hashing can generalize the benefits of single cell multiplexing to diverse samples and experimental designs.

2019-02-01

Protein identification strategies in MALDI imaging mass spectrometry: a brief review

Ryan DJ, Spraggins JM, Caprioli RM.

TMC-Vanderbilt (Kidney)

Matrix assisted laser desorption/ionization (MALDI) imaging mass spectrometry (IMS) is a powerful technology used to investigate the spatial distributions of thousands of molecules throughout a tissue section from a single experiment. As proteins represent an important group of functional molecules in tissue and cells, the imaging of proteins has been an important point of focus in the development of IMS technologies and methods. Protein identification is crucial for the biological contextualization of molecular imaging data. However, gas-phase fragmentation efficiency of MALDI generated proteins presents significant challenges, making protein identification directly from tissue difficult. This review highlights methods and technologies specifically related to protein identification that have been developed to overcome these challenges in MALDI IMS experiments.

2019-02-05

Data visualization literacy: Definitions, conceptual frameworks, exercises, and assessments

Börner K, Bueckle A, Ginda M.

HIVE MC-IU

In the information age, the ability to read and construct data visualizations becomes as important as the ability to read and write text. However, while standard definitions and theoretical frameworks to teach and assess textual, mathematical, and visual literacy exist, current data visualization literacy (DVL) definitions and frameworks are not comprehensive enough to guide the design of DVL teaching and assessment. This paper introduces a data visualization literacy framework (DVL-FW) that was specifically developed to define, teach, and assess DVL. The holistic DVL-FW promotes both the reading and construction of data visualizations, a pairing analogous to that of both reading and writing in textual literacy and understanding and applying in mathematical literacy. Specifically, the DVL-FW defines a hierarchical typology of core concepts and details the process steps that are required to extract insights from data. Advancing the state of the art, the DVL-FW interlinks theoretical and procedural knowledge and showcases how both can be combined to design curricula and assessment measures for DVL. Earlier versions of the DVL-FW have been used to teach DVL to more than 8,500 residential and online students, and results from this effort have helped revise and validate the DVL-FW presented here.

2019-02-15

Dhaka: Variational Autoencoder for Unmasking Tumor Heterogeneity from Single Cell Genomic Data

Rashid S, Shah S, Bar-Joseph Z, Pandya R.

HIVE TC-CMU

MOTIVATION: Intra-tumor heterogeneity is one of the key confounding factors in deciphering tumor evolution. Malignant cells exhibit variations in their gene expression, copy numbers, and mutation even when originating from a single progenitor cell. Single cell sequencing of tumor cells has recently emerged as a viable option for unmasking the underlying tumor heterogeneity. However, extracting features from single cell genomic data in order to infer their evolutionary trajectory remains computationally challenging due to the extremely noisy and sparse nature of the data. RESULTS: Here we describe 'Dhaka', a variational autoencoder method which transforms single cell genomic data to a reduced dimension feature space that is more efficient in differentiating between (hidden) tumor subpopulations. Our method is general and can be applied to several different types of genomic data including copy number variation from scDNA-Seq and gene expression from scRNA-Seq experiments. We tested the method on synthetic and 6 single cell cancer datasets where the number of cells ranges from 250 to 6000 for each sample. Analysis of the resulting feature space revealed subpopulations of cells and their marker genes. The features are also able to infer the lineage and/or differentiation trajectory between cells greatly improving upon prior methods suggested for feature extraction and dimensionality reduction of such data. AVAILABILITY AND IMPLEMENTATION: All the datasets used in the paper are publicly available, and the developed software package and supporting information are available on GitHub at https://github.com/MicrosoftGenomics/Dhaka.

2019-02-20

The single-cell transcriptional landscape of mammalian organogenesis

Cao J, Spielmann M, Qiu X, Huang X, Ibrahim DM, Hill AJ, Zhang F, Mundlos S, Christiansen L, Steemers FJ, Trapnell C, Shendure J

TMC-Cal Tech

Mammalian organogenesis is a remarkable process. Within a short timeframe, the cells of the three germ layers transform into an embryo that includes most of the major internal and external organs. Here we investigate the transcriptional dynamics of mouse organogenesis at single-cell resolution. Using single-cell combinatorial indexing, we profiled the transcriptomes of around 2 million cells derived from 61 embryos staged between 9.5 and 13.5 days of gestation, in a single experiment. The resulting ‘mouse organogenesis cell atlas’ (MOCA) provides a global view of developmental processes during this critical window. We use Monocle 3 to identify hundreds of cell types and 56 trajectories, many of which are detected only because of the depth of cellular coverage, and collectively define thousands of corresponding marker genes. We explore the dynamics of gene expression within cell types and trajectories over time, including focused analyses of the apical ectodermal ridge, limb mesenchyme and skeletal muscle.

2019-03-01

Multiple TOF/TOF Events in a Single Laser Shot for Multiplexed Lipid Identifications in MALDI Imaging Mass Spectrometry

Prentice BM, McMillen JC, Caprioli RM

TMC-Vanderbilt (Kidney)

Tandem mass spectrometry (MS/MS) is often used to identify lipids in matrix-assisted laser desorption/ionization imaging mass spectrometry (MALDI IMS) workflows. The molecular specificity afforded by MS/MS is crucial on MALDI time-of-flight (TOF) platforms that generally lack high resolution accurate mass measurement capabilities. Unfortunately, imaging MS/MS workflows generally only monitor a single precursor ion over the imaged area, limiting the throughput of this methodology. Herein, we demonstrate that multiple TOF/TOF events performed in each laser shot can be used to improve the throughput of imaging MS/MS. This is shown to enable the simultaneous identification of multiple phosphatidylcholine lipids in rat brain tissue. Uniquely, the separation in time achieved for the precursor ions in the TOF-1 region of the instrument is maintained for the fragment ions as they are analyzed in TOF-2, allowing for the differentiation of fragment ions of the exact same m/z derived from different precursor ions (e.g., the m/z 163 fragment ion from precursor ion m/z 772.5 is easily distinguished from the m/z 163 fragment ion from precursor ion m/z 826.5). This multiplexed imaging MS/MS approach allows for the acquisition of complete fragment ion spectra for multiple precursor ions per laser shot.

2019-03-25

Transcriptome-scale super-resolved imaging in tissues by RNA seqFISH

Eng CL, Lawson M, Zhu Q, Dries R, Koulena N, Takei Y, Yun J, Cronin C, Karp C, Yuan GC, Cai L.

TMC-Cal Tech

Imaging the transcriptome in situ with high accuracy has been a major challenge in single-cell biology, which is particularly hindered by the limits of optical resolution and the density of transcripts in single cells. Here we demonstrate an evolution of sequential fluorescence in situ hybridization (seqFISH+). We show that seqFISH+ can image mRNAs for 10,000 genes in single cells, with high accuracy and sub-diffraction-limit resolution, in the cortex, subventricular zone and olfactory bulb of mouse brain, using a standard confocal microscope. The transcriptome-level profiling of seqFISH+ allows unbiased identification of cell classes and their spatial organization in tissues. In addition, seqFISH+ reveals subcellular mRNA localization patterns in cells and ligand-receptor pairs across neighbouring cells. This technology demonstrates the ability to generate spatial cell atlases and to perform discovery-driven studies of biological processes in situ.

2019-03-29

Slide-seq: A scalable technology for measuring genome-wide expression at high spatial resolution

Rodriques SG, Stickels RR, Goeva A, Martin CA, Murray E, Vanderburg CR, Welch J, Chen LM, Chen F, Macosko EZ

RTI-Broad

Spatial positions of cells in tissues strongly influence function, yet a high-throughput, genome-wide readout of gene expression with cellular resolution is lacking. We developed Slide-seq, a method for transferring RNA from tissue sections onto a surface covered in DNA-barcoded beads with known positions, allowing the locations of the RNA to be inferred by sequencing. Using Slide-seq, we localized cell types identified by single-cell RNA sequencing datasets within the cerebellum and hippocampus, characterized spatial gene expression patterns in the Purkinje layer of mouse cerebellum, and defined the temporal evolution of cell type-specific responses in a mouse model of traumatic brain injury. These studies highlight how Slide-seq provides a scalable method for obtaining spatially resolved gene expression data at resolutions comparable to the sizes of individual cells.

2019-04-06

Imaging mass spectrometry enables molecular profiling of mouse and human pancreatic tissue

Prentice BM, Hart NJ, Phillips N, Haliyur R, Judd A, Armandala R, Spraggins JM, Lowe CL, Boyd KL, Stein RW, Wright CV, Norris JL, Powers AC, Brissova M, Caprioli RM.

TMC-Vanderbilt (Kidney)

The molecular response and function of pancreatic islet cells during metabolic stress is a complex process. The anatomical location and small size of pancreatic islets coupled with current methodological limitations have prevented the achievement of a complete, coherent picture of the role that lipids and proteins play in cellular processes under normal conditions and in diseased states. Herein, we describe the development of untargeted tissue imaging mass spectrometry (IMS) technologies for the study of in situ protein and, more specifically, lipid distributions in murine and human pancreases.

2019-04-19

The Importance of Clinical Tissue Imaging

Spraggins JM, Schwamborn K, Heeren RMA, Eberlin LS.

TMC-Vanderbilt (Kidney)

Tissue imaging by mass spectrometry (MS) combines the sensitivity and molecular specificity of MS with the spatial fidelity of classical histology for analysis of metabolites, lipids and proteins in tissues (Fig. 1). MS-based imaging is label-free, untargeted, sensitive, and specific, thereby enabling application in both basic biomedical research and the clinical laboratory. While all tissue imaging experiments are conceptually similar in their ability to generate spatial molecular data, ionization, data collection, and purpose vary widely. Here, we highlight recent technical advances and efforts that are motivating translational applications of this emerging technology.

2019-05-01

SABER amplifies FISH: enhanced multiplexed imaging of RNA and DNA in cells and tissues

Kishi JY, Lapan SW, Beliveau BJ, West ER, Zhu A, Sasaki HM, Saka SK, Wang Y, Cepko CL, Yin P.

TTD-Harvard

Fluorescence in situ hybridization (FISH) reveals the abundance and positioning of nucleic acid sequences in fixed samples. Despite recent advances in multiplexed amplification of FISH signals, it remains challenging to achieve high levels of simultaneous amplification and sequential detection with high sampling efficiency and simple workflows. Here we introduce signal amplification by exchange reaction (SABER), which endows oligonucleotide-based FISH probes with long, single-stranded DNA concatemers that aggregate a multitude of short complementary fluorescent imager strands. We show that SABER amplified RNA and DNA FISH signals (5- to 450-fold) in fixed cells and tissues. We also applied 17 orthogonal amplifiers against chromosomal targets simultaneously and detected mRNAs with high efficiency. We then used 10-plex SABER-FISH to identify in vivo introduced enhancers with cell-type-specific activity in the mouse retina. SABER represents a simple and versatile molecular toolkit for rapid and cost-effective multiplexed imaging of nucleic acid targets.

2019-05-06

Visualizing learner engagement, performance, and trajectories to evaluate and optimize online course design

Ginda M, Richey MC, Cousino M, Börner K.

HIVE MC-IU

Learning analytics and visualizations make it possible to examine and communicate learners’ engagement, performance, and trajectories in online courses to evaluate and optimize course design for learners. This is particularly valuable for workforce training involving employees who need to acquire new knowledge in the most effective manner. This paper introduces a set of metrics and visualizations that aim to capture key dynamical aspects of learner engagement, performance, and course trajectories. The metrics are applied to identify prototypical behavior and learning pathways through and interactions with course content, activities, and assessments. The approach is exemplified and empirically validated using more than 30 million separate logged events that capture activities of 1,608 Boeing engineers taking the MITxPro Course, “Architecture of Complex Systems,” delivered in Fall 2016. Visualization results show course structure and patterns of learner interactions with course material, activities, and assessments. Tree visualizations are used to represent course hierarchical structures and explicit sequence of content modules. Learner trajectory networks represent pathways and interactions of individual learners through course modules, revealing patterns of learner engagement, content access strategies, and performance. Results provide evidence for instructors and course designers for evaluating the usage and effectiveness of course materials and intervention strategies.

2019-06-04

Cell lineage inference from SNP and scRNA-Seq data

Ding J, Lin C, Bar-Joseph Z.

HIVE TC-CMU

Several recent studies focus on the inference of developmental and response trajectories from single cell RNA-Seq (scRNA-Seq) data. A number of computational methods, often referred to as pseudo-time ordering, have been developed for this task. Recently, CRISPR has also been used to reconstruct lineage trees by inserting random mutations. However, both approaches suffer from drawbacks that limit their use. Here, we develop a method to detect significant, cell type specific, sequence mutations from scRNA-Seq data. We show that only a few mutations are enough for reconstructing good branching models. Integrating these mutations with expression data further improves the accuracy of the reconstructed models. As we show, the majority of mutations we identify are likely RNA editing events indicating that such information can be used to distinguish cell types.

2019-06-06

Comprehensive Integration of Single-Cell Data

Stuart T, Butler A, Hoffman P, Hafemeister C, Papalexi E, Mauck WM 3rd, Hao Y, Stoeckius M, Smibert P, Satija R.

HIVE MC-NYGC

Single-cell transcriptomics has transformed our ability to characterize cell states, but deep biological understanding requires more than a taxonomic listing of clusters. As new methods arise to measure distinct cellular modalities, a key analytical challenge is to integrate these datasets to better understand cellular identity and function. Here, we develop a strategy to “anchor” diverse datasets together, enabling us to integrate single-cell measurements not only across scRNA-seq technologies, but also across different modalities. After demonstrating improvement over existing methods for integrating scRNA-seq data, we anchor scRNA-seq experiments with scATAC-seq to explore chromatin differences in closely related interneuron subsets and project protein expression measurements onto a bone marrow atlas to characterize lymphocyte populations. Lastly, we harmonize in situ gene expression and scRNA-seq datasets, allowing transcriptome-wide imputation of spatial gene expression patterns. Our work presents a strategy for the assembly of harmonized references and transfer of information across datasets.

2019-06-07

The human body at cellular resolution: the NIH Human Biomolecular Atlas Program

HuBMAP Consortium

Consortium

Transformative technologies are enabling the construction of three-dimensional maps of tissues with unprecedented spatial and molecular resolution. Over the next seven years, the NIH Common Fund Human Biomolecular Atlas Program (HuBMAP) intends to develop a widely accessible framework for comprehensively mapping the human body at single-cell resolution by supporting technology development, data acquisition, and detailed spatial mapping. HuBMAP will integrate its efforts with other funding agencies, programs, consortia, and the biomedical research community at large towards the shared vision of a comprehensive, accessible three-dimensional molecular and cellular atlas of the human body, in health and under various disease conditions.

2019-06-13

Two Specific Sulfatide Species Are Dysregulated during Renal Development in a Mouse Model of Alport Syndrome

Gessel MM, Spraggins JM, Voziyan PA, Abrahamson DR, Caprioli RM, Hudson BG.

TMC-Vanderbilt (Kidney)

Alport syndrome is caused by mutations in collagen IV that alter the morphology of renal glomerular basement membrane. Mutations result in proteinuria, tubulointerstitial fibrosis, and renal failure but the pathogenic mechanisms are not fully understood. Using imaging mass spectrometry, we aimed to determine whether the spatial and/or temporal patterns of renal lipids are perturbed during the development of Alport syndrome in the mouse model. Our results show that most sulfatides are present at similar levels in both the wild-type (WT) and the Alport kidneys, with the exception of two specific sulfatide species, SulfoHex-Cer(d18:2/24:0) and SulfoHex-Cer(d18:2/16:0). In the Alport but not in WT kidneys, the levels of these species mirror the previously described abnormal laminin expression in Alport syndrome. The presence of these sulfatides in renal tubules but not in glomeruli suggests that this specific aberrant lipid pattern may be related to the development of tubulointerstitial fibrosis in Alport disease.

2019-06-18

MicroLESA: Integrating Autofluorescence Microscopy, In Situ Micro-Digestions, and Liquid Extraction Surface Analysis for High Spatial Resolution Targeted Proteomic Studies

Ryan DJ, Patterson NH, Putnam NE, Wilde AD, Weiss A, Perry WJ, Cassat JE, Skaar EP, Caprioli RM, Spraggins JM.

TMC-Vanderbilt (Kidney)

The ability to target discrete features within tissue using liquid surface extractions enables the identification of proteins while maintaining the spatial integrity of the sample. Here, we present a liquid extraction surface analysis (LESA) workflow, termed microLESA, that allows proteomic profiling from discrete tissue features of ∼110 μm in diameter by integrating nondestructive autofluorescence microscopy and spatially targeted liquid droplet micro-digestion. Autofluorescence microscopy provides the visualization of tissue foci without the need for chemical stains or the use of serial tissue sections. Tryptic peptides are generated from tissue foci by applying small volume droplets (∼250 pL) of enzyme onto the surface prior to LESA. The microLESA workflow reduced the diameter of the sampled area almost 5-fold compared to previous LESA approaches. Experimental parameters, such as tissue thickness, trypsin concentration, and enzyme incubation duration, were tested to maximize proteomics analysis. The microLESA workflow was applied to the study of fluorescently labeled Staphylococcus aureus infected murine kidney to identify unique proteins related to host defense and bacterial pathogenesis. Proteins related to nutritional immunity and host immune response were identified by performing microLESA at the infectious foci and surrounding abscess. These identifications were then used to annotate specific proteins observed in infected kidney tissue by MALDI FT-ICR IMS through accurate mass matching.

2019-06-19

The 2019 mathematical oncology roadmap

Rockne RC, Hawkins-Daarud A, Swanson KR, Sluka JP, Glazier JA, Macklin P, Hormuth DA, Jarrett AM, Lima EABF, Tinsley Oden J, Biros G, Yankeelov TE, Curtius K, Al Bakir I, Wodarz D, Komarova N, Aparicio L, Bordyuh M, Rabadan R, Finley SD, Enderling H, Caudell J, et al.

HIVE MC-IU

Whether the nom de guerre is Mathematical Oncology, Computational or Systems Biology, Theoretical Biology, Evolutionary Oncology, Bioinformatics, or simply Basic Science, there is no denying that mathematics continues to play an increasingly prominent role in cancer research. Mathematical Oncology—defined here simply as the use of mathematics in cancer research—complements and overlaps with a number of other fields that rely on mathematics as a core methodology. As a result, Mathematical Oncology has a broad scope, ranging from theoretical studies to clinical trials designed with mathematical models. This Roadmap differentiates Mathematical Oncology from related fields and demonstrates specific areas of focus within this unique field of research. The dominant theme of this Roadmap is the personalization of medicine through mathematics, modelling, and simulation. This is achieved through the use of patient-specific clinical data to: develop individualized screening strategies to detect cancer earlier; make predictions of response to therapy; design adaptive, patient-specific treatment plans to overcome therapy resistance; and establish domain-specific standards to share model predictions and to make models and simulations reproducible. The cover art for this Roadmap was chosen as an apt metaphor for the beautiful, strange, and evolving relationship between mathematics and cancer.

2019-06-27

A single-nucleus RNA-sequencing pipeline to decipher the molecular anatomy and pathophysiology of human kidneys

Lake BB, Chen S, Hoshi M, Plongthongkum N, Salamon D, Knoten A, Vijayan A, Venkatesh R, Kim EH, Gao D, Gaut J, Zhang K, Jain S

TMC-UCSD

Defining cellular and molecular identities within the kidney is necessary to understand its organization and function in health and disease. Here we demonstrate a reproducible method with minimal artifacts for single-nucleus Droplet-based RNA sequencing (snDrop-Seq) that we use to resolve thirty distinct cell populations in human adult kidney. We define molecular transition states along more than ten nephron segments spanning two major kidney regions. We further delineate cell type-specific expression of genes associated with chronic kidney disease, diabetes and hypertension, providing insight into possible targeted therapies. This includes expression of a hypertension-associated mechano-sensory ion channel in mesangial cells, and identification of proximal tubule cell populations defined by pathogenic expression signatures. Our fully optimized, quality-controlled transcriptomic profiling pipeline constitutes a tool for the generation of healthy and diseased molecular atlases applicable to clinical samples.

2019-08-01

Judd AM, Gutierrez DB, Moore JL, Patterson NH, Yang J, Romer CE, Norris JL, Caprioli RM

TMC-Vanderbilt (Kidney)

Matrix-assisted laser desorption/ionization imaging mass spectrometry (MALDI IMS) is a molecular imaging technology uniquely capable of untargeted measurement of proteins, lipids, and metabolites while retaining spatial information about their location in situ. This powerful combination of capabilities has the potential to bring a wealth of knowledge to the field of molecular histology. Translation of this innovative research tool into clinical laboratories requires the development of reliable sample preparation protocols for the analysis of proteins from formalin-fixed paraffin-embedded (FFPE) tissues, the standard preservation process in clinical pathology. Although ideal for stained tissue analysis by microscopy, the FFPE process cross-links, disrupts, or can remove proteins from the tissue, making analysis of the protein content challenging. To date, reported approaches differ widely in process and efficacy. This tutorial presents a strategy derived from systematic testing and optimization of key parameters, for reproducible in situ tryptic digestion of proteins in FFPE tissue and subsequent MALDI IMS analysis. The approach describes a generalized method for FFPE tissues originating from virtually any source.

2019-08-19

Immuno-SABER enables highly multiplexed and amplified protein imaging in tissues

Saka SK, Wang Y, Kishi JY, Zhu A, Zeng Y, Xie W, Kirli K, Yapp C, Cicconet M, Beliveau BJ, Lapan SW, Yin S, Lin M, Boyden ES, Kaeser PS, Pihan G, Church GM, Yin P.

TTD-Harvard

Spatial mapping of proteins in tissues is hindered by limitations in multiplexing, sensitivity and throughput. Here we report immunostaining with signal amplification by exchange reaction (Immuno-SABER), which achieves highly multiplexed signal amplification via DNA-barcoded antibodies and orthogonal DNA concatemers generated by primer exchange reaction (PER). SABER offers independently programmable signal amplification without in situ enzymatic reactions, and intrinsic scalability to rapidly amplify and visualize a large number of targets when combined with fast exchange cycles of fluorescent imager strands. We demonstrate 5- to 180-fold signal amplification in diverse samples (cultured cells, cryosections, formalin-fixed paraffin-embedded sections and whole-mount tissues), as well as simultaneous signal amplification for ten different proteins using standard equipment and workflows. We also combined SABER with expansion microscopy to enable rapid, multiplexed super-resolution tissue imaging. Immuno-SABER presents an effective and accessible platform for multiplexed and amplified imaging of proteins with high sensitivity and throughput.

2019-09-02

A pooled single-cell genetic screen identifies regulatory checkpoints in the continuum of the epithelial-to-mesenchymal transition

McFaline-Figueroa JL, Hill AJ, Qiu X, Jackson D, Shendure J, Trapnell C.

TMC-Cal Tech

Integrating single-cell trajectory analysis with pooled genetic screening could reveal the genetic architecture that guides cellular decisions in development and disease. We applied this paradigm to probe the genetic circuitry that controls epithelial-to-mesenchymal transition (EMT). We used single-cell RNA sequencing to profile epithelial cells undergoing a spontaneous spatially determined EMT in the presence or absence of transforming growth factor-β. Pseudospatial trajectory analysis identified continuous waves of gene regulation as opposed to discrete ‘partial’ stages of EMT. KRAS was connected to the exit from the epithelial state and the acquisition of a fully mesenchymal phenotype. A pooled single-cell CRISPR-Cas9 screen identified EMT-associated receptors and transcription factors, including regulators of KRAS, whose loss impeded progress along the EMT. Inhibiting the KRAS effector MEK and its upstream activators EGFR and MET demonstrates that interruption of key signaling events reveals regulatory ‘checkpoints’ in the EMT continuum that mimic discrete stages, and reconciles opposing views of the program that controls EMT.

2019-09-10

Supervised classification enables rapid annotation of cell atlases

Pliner HA, Shendure J, Trapnell C.

TMC-Cal Tech

Single-cell molecular profiling technologies are gaining rapid traction, but the manual process by which resulting cell types are typically annotated is labor intensive and rate-limiting. We describe Garnett, a tool for rapidly annotating cell types in single-cell transcriptional profiling and single-cell chromatin accessibility datasets, based on an interpretable, hierarchical markup language of cell type-specific genes. Garnett successfully classifies cell types in tissue and whole organism datasets, as well as across species.

2019-10-07

GiniClust3: a fast and memory-efficient tool for rare cell type identification

Dong R, Yuan GC.

TTD-Cal Tech

BACKGROUND: With the rapid development of single-cell RNA sequencing technology, it is possible to dissect cell-type composition at high resolution. A number of methods have been developed to identify rare cell types. However, existing methods are still not scalable to large datasets, limiting their utility. To overcome this limitation, we present a new software package, called GiniClust3, which is an extension of GiniClust2 and is significantly faster and more memory-efficient than previous versions. RESULTS: Using GiniClust3, it takes only about 7 h to identify both common and rare cell clusters from a dataset that contains more than one million cells. Cell type mapping and perturbation analyses show that GiniClust3 can robustly identify cell clusters. CONCLUSIONS: Taken together, these results suggest that GiniClust3 is a powerful tool for identifying both common and rare cell populations and can handle large datasets. GiniClust3 is implemented as an open-source Python package and is available at https://github.com/rdong08/GiniClust3.

2019-10-08

High-Performance Molecular Imaging with MALDI Trapped Ion-Mobility Time-of-Flight (timsTOF) Mass Spectrometry

Spraggins JM, Djambazova KV, Rivera ES, Migas LG, Neumann EK, Fuetterer A, Suetering J, Goedecke N, Ly A, Van de Plas R, Caprioli RM.

TMC-Vanderbilt (Kidney)

2019-10-14

High-throughput sequencing of the transcriptome and chromatin accessibility in the same cell

Chen S, Lake BB, Zhang K.

TMC-UCSD

Single-cell RNA sequencing can reveal the transcriptional state of cells, yet provides little insight into the upstream regulatory landscape associated with open or accessible chromatin regions. Joint profiling of accessible chromatin and RNA within the same cells would permit direct matching of transcriptional regulation to its outputs. Here, we describe droplet-based single-nucleus chromatin accessibility and mRNA expression sequencing (SNARE-seq), a method that can link a cell’s transcriptome with its accessible chromatin for sequencing at scale. Specifically, accessible sites are captured by Tn5 transposase in permeabilized nuclei to permit, within many droplets in parallel, DNA barcode tagging together with the mRNA molecules from the same cells. To demonstrate the utility of SNARE-seq, we generated joint profiles of 5,081 and 10,309 cells from neonatal and adult mouse cerebral cortices, respectively. We reconstructed the transcriptome and epigenetic landscapes of major and rare cell types, uncovered lineage-specific accessible sites, especially for low-abundance cells, and connected the dynamics of promoter accessibility with transcription level during neurogenesis.

2019-10-29

Staphylococcus aureus exhibits heterogeneous siderophore production within the vertebrate host

Perry WJ, Spraggins JM, Sheldon JR, Grunenwald CM, Heinrichs DE, Cassat JE, Skaar EP, Caprioli RM

TMC-Vanderbilt (Kidney)

Siderophores, iron-scavenging small molecules, are fundamental to bacterial nutrient metal acquisition and enable pathogens to overcome challenges imposed by nutritional immunity. Multimodal imaging mass spectrometry allows visualization of host-pathogen iron competition, by mapping siderophores within infected tissue. We have observed heterogeneous distributions of Staphylococcus aureus siderophores across infectious foci, challenging the paradigm that the vertebrate host is a uniformly iron-depleted environment to invading microbes.

2019-11-13

High spatial resolution imaging of biological tissues using nanospray desorption electrospray ionization mass spectrometry

Yin R, Burnum-Johnson KE, Sun X, Dey SK, Laskin J

TTD-Purdue

Mass spectrometry imaging (MSI) enables label-free spatial mapping of hundreds of biomolecules in tissue sections. This capability provides valuable information on tissue heterogeneity that is difficult to obtain using population-averaged assays. Despite substantial developments in both instrumentation and methodology, MSI of tissue samples at single-cell resolution remains challenging. Herein, we describe a protocol for robust imaging of tissue sections with a high (better than 10-μm) spatial resolution using nanospray desorption electrospray ionization (nano-DESI) mass spectrometry, an ambient ionization technique that does not require sample pretreatment before analysis. In this protocol, mouse uterine tissue is used as a model system to illustrate both the workflow and data obtained in these experiments. We provide a detailed description of the nano-DESI MSI platform, fabrication of the nano-DESI and shear force probes, shear force microscopy experiments, spectral acquisition, and data processing. A properly trained researcher (e.g., technician, graduate student, or postdoc) can complete all the steps from probe fabrication to data acquisition and processing within a single day. We also describe a new strategy for acquiring both positive- and negative-mode imaging data in the same experiment. This is achieved by alternating between positive and negative data acquisition modes during consecutive line scans. Using our imaging approach, hundreds of high-quality ion images were obtained from a single uterine section. This protocol enables sensitive and quantitative imaging of lipids and metabolites in heterogeneous tissue sections with high spatial resolution, which is critical to understanding biochemical processes occurring in biological tissues.

2019-11-15

Continuous State HMMs for Modeling Time Series Single Cell RNA-Seq Data

Lin C, Bar-Joseph Z.

HIVE TC-CMU

MOTIVATION: Methods for reconstructing developmental trajectories from time series single cell RNA-Seq (scRNA-Seq) data can be largely divided into two categories. The first, often referred to as pseudotime ordering methods, are deterministic and rely on dimensionality reduction followed by an ordering step. The second learns a probabilistic branching model to represent the developmental process. While both types have been successful, each suffers from shortcomings that can impact their accuracy. RESULTS: We developed a new method based on continuous state HMMs (CSHMMs) for representing and modeling time series scRNA-Seq data. We define the CSHMM model and provide efficient learning and inference algorithms which allow the method to determine both the structure of the branching process and the assignment of cells to these branches. Analyzing several developmental single cell datasets, we show that the CSHMM method accurately infers branching topology and correctly and continuously assigns cells to paths, improving upon prior methods proposed for this task. Analysis of genes based on the continuous cell assignment identifies known and novel markers for different cell types. AVAILABILITY: Software and Supporting website: www.andrew.cmu.edu/user/chiehl1/CSHMM/. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.

2019-12-12

Toward a Common Coordinate Framework for the Human Body

Rood JE, Stuart T, Ghazanfar S, Biancalani T, Fisher E, Butler A, Hupalowska A, Gaffney L, Mauck W, Eraslan G, Marioni JC, Regev A, Satija R.

HIVE MC-NYGC

Understanding the genetic and molecular drivers of phenotypic heterogeneity across individuals is central to biology. As new technologies enable fine-grained and spatially resolved molecular profiling, we need new computational approaches to integrate data from the same organ across different individuals into a consistent reference and to construct maps of molecular and cellular organization at histological and anatomical scales. Here, we review previous efforts and discuss challenges involved in establishing such a common coordinate framework, the underlying map of tissues and organs. We focus on strategies to handle anatomical variation across individuals and highlight the need for new technologies and analytical methods spanning multiple hierarchical scales of spatial resolution.

2019-12-20

Uncovering matrix effects on lipid analyses in MALDI imaging mass spectrometry experiments

Perry WJ, Patterson NH, Prentice BM, Neumann EK, Caprioli RM, Spraggins JM.

TMC-Vanderbilt (Kidney)

The specific matrix used in matrix‐assisted laser desorption/ionization imaging mass spectrometry (MALDI IMS) can have an effect on the molecules ionized from a tissue sample. The sensitivity for distinct classes of biomolecules can vary when employing different MALDI matrices. Here, we compare the intensities of various lipid subclasses measured by Fourier transform ion cyclotron resonance (FT‐ICR) IMS of murine liver tissue when using 9‐aminoacridine (9AA), 5‐chloro‐2‐mercaptobenzothiazole (CMBT), 1,5‐diaminonaphthalene (DAN), 2,5‐Dihydroxyacetophenone (DHA), and 2,5‐dihydroxybenzoic acid (DHB). Principal component analysis and receiver operating characteristic curve analysis revealed significant matrix effects on the relative signal intensities observed for different lipid subclasses and adducts. Comparison of spectral profiles and quantitative assessment of the number and intensity of species from each lipid subclass showed that each matrix produces unique lipid signals. In positive ion mode, matrix application methods played a role in the MALDI analysis for different cationic species. Comparisons of different methods for the application of DHA showed a significant increase in the intensity of sodiated and potassiated analytes when using an aerosol sprayer. In negative ion mode, lipid profiles generated using DAN were significantly different than all other matrices tested. This difference was found to be driven by modification of phosphatidylcholines during ionization that enables them to be detected in negative ion mode. These modified phosphatidylcholines are isomeric with common phosphatidylethanolamines confounding MALDI IMS analysis when using DAN. These results show an experimental basis of MALDI analyses when analyzing lipids from tissue and allow for more informed selection of MALDI matrices when performing lipid IMS experiments.

2019-12-23

Normalization and variance stabilization of single-cell RNA-seq data using regularized negative binomial regression

Hafemeister C, Satija R.

HIVE MC-NYGC

Single-cell RNA-seq (scRNA-seq) data exhibits significant cell-to-cell variation due to technical factors, including the number of molecules detected in each cell, which can confound biological heterogeneity with technical effects. To address this, we present a modeling framework for the normalization and variance stabilization of molecular count data from scRNA-seq experiments. We propose that the Pearson residuals from “regularized negative binomial regression,” where cellular sequencing depth is utilized as a covariate in a generalized linear model, successfully remove the influence of technical characteristics from downstream analyses while preserving biological heterogeneity. Importantly, we show that an unconstrained negative binomial model may overfit scRNA-seq data, and overcome this by pooling information across genes with similar abundances to obtain stable parameter estimates. Our procedure omits the need for heuristic steps including pseudocount addition or log-transformation and improves common downstream analytical tasks such as variable gene selection, dimensional reduction, and differential expression. Our approach can be applied to any UMI-based scRNA-seq dataset and is freely available as part of the R package sctransform, with a direct interface to our single-cell toolkit Seurat.
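The residual described in this abstract can be illustrated with a minimal sketch. It assumes the usual negative binomial parameterization (variance = mu + mu^2/theta) and a log-link model with log10 sequencing depth as the only covariate. The function names and the coefficients passed in are hypothetical, this is not the sctransform API, and the regularization step (pooling parameter estimates across genes of similar abundance) is not shown.

```python
import math


def expected_count(depth, intercept, slope):
    """Expected UMI count for one gene in one cell.

    Assumed model: log(mu) = intercept + slope * log10(depth),
    where depth is the total UMI count of the cell.
    """
    return math.exp(intercept + slope * math.log10(depth))


def nb_pearson_residual(count, mu, theta):
    """Pearson residual of an observed count under a negative binomial
    with mean mu and dispersion theta (variance = mu + mu**2 / theta)."""
    variance = mu + mu * mu / theta
    return (count - mu) / math.sqrt(variance)


if __name__ == "__main__":
    # Hypothetical coefficients for a single gene, for illustration only.
    mu = expected_count(depth=5000, intercept=-6.0, slope=2.0)
    print(nb_pearson_residual(count=3, mu=mu, theta=10.0))
```

A residual near zero means the observed count is about what sequencing depth alone would predict; large positive or negative values flag variation that depth does not explain.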

2019-12-26

Deep learning for inferring gene relationships from single-cell expression data

Yuan Y, Bar-Joseph Z.

HIVE TC-CMU

Several methods were developed to mine gene–gene relationships from expression data. Examples include correlation and mutual information methods for coexpression analysis, clustering and undirected graphical models for functional assignments, and directed graphical models for pathway reconstruction. Using an encoding for gene expression data, followed by deep neural networks analysis, we present a framework that can successfully address all of these diverse tasks. We show that our method, convolutional neural network for coexpression (CNNC), improves upon prior methods in tasks ranging from predicting transcription factor targets to identifying disease-related genes to causality inference. CNNC’s encoding provides insights about some of the decisions it makes and their biological basis. CNNC is flexible and can easily be extended to integrate additional types of genomics data, leading to further improvements in its performance.

2020-01-07

Automated mass spectrometry imaging of over 2000 proteins from tissue sections at 100-μm spatial resolution

Piehowski PD, Zhu Y, Bramer LM, Stratton KG, Zhao R, Orton DJ, Moore RJ, Yuan J, Mitchell HD, Gao Y, Webb-Robertson BM, Dey SK, Kelly RT, Burnum-Johnson KE.

TTD-Purdue

Biological tissues exhibit complex spatial heterogeneity that directs the functions of multicellular organisms. Quantifying protein expression is essential for elucidating processes within complex biological assemblies. Imaging mass spectrometry (IMS) is a powerful emerging tool for mapping the spatial distribution of metabolites and lipids across tissue surfaces, but technical challenges have limited the application of IMS to the analysis of proteomes. Methods for probing the spatial distribution of the proteome have generally relied on the use of labels and/or antibodies, which limits multiplexing and requires a priori knowledge of protein targets. Past efforts to make spatially resolved proteome measurements across tissues have had limited spatial resolution and proteome coverage and have relied on manual workflows. Here, we demonstrate an automated approach to imaging that utilizes label-free nanoproteomics to analyze tissue voxels, generating quantitative cell-type-specific images for >2000 proteins with 100-µm spatial resolution across mouse uterine tissue sections preparing for blastocyst implantation.

2020-02-18

Inferring TF activation order in time series scRNA-Seq studies

Lin C, Ding J, Bar-Joseph Z.

HIVE TC-CMU

Methods for the analysis of time series single cell expression data (scRNA-Seq) either do not utilize information about transcription factors (TFs) and their targets or only study these as a post-processing step. Using such information can both improve the accuracy of the reconstructed model and cell assignments and provide information on how and when the process is regulated. We developed the Continuous-State Hidden Markov Models TF (CSHMM-TF) method, which integrates probabilistic modeling of scRNA-Seq data with the ability to assign TFs to specific activation points in the model. TFs are assumed to influence the emission probabilities for cells assigned to later time points, allowing us to identify not just the TFs controlling each path but also their order of activation. We tested CSHMM-TF on several mouse and human datasets. As we show, the method was able to identify known and novel TFs for all processes, the assigned times of activation agree with both expression information and prior knowledge, and combinatorial predictions are supported by known interactions. We also show that CSHMM-TF improves upon prior methods that do not utilize TF-gene interaction.

2020-02-28

Immune monitoring using mass cytometry and related high-dimensional imaging approaches

Hartmann FJ, Bendall SC.

RTI-Stanford

The cellular complexity and functional diversity of the human immune system necessitate the use of high-dimensional single-cell tools to uncover its role in multifaceted diseases such as rheumatic diseases, as well as other autoimmune and inflammatory disorders. Proteomic technologies that use elemental (heavy metal) reporter ions, such as mass cytometry (also known as CyTOF) and analogous high-dimensional imaging approaches (including multiplexed ion beam imaging (MIBI) and imaging mass cytometry (IMC)), have been developed from their low-dimensional counterparts, flow cytometry and immunohistochemistry, to meet this need. A growing number of studies have been published that use these technologies to identify functional biomarkers and therapeutic targets in rheumatic diseases, but the full potential of their application to rheumatic disease research has yet to be fulfilled. This Review introduces the underlying technologies for high-dimensional immune monitoring and discusses aspects necessary for their successful implementation, including study design principles, analytical tools and future developments for the field of rheumatology.

2020-03-11

Multiplexed single-cell morphometry for hematopathology diagnostics

Tsai AG, Glass DR, Juntilla M, Hartmann FJ, Oak JS, Fernandez-Pol S, Ohgami RS, Bendall SC.

RTI-Stanford

The diagnosis of lymphomas and leukemias requires hematopathologists to integrate microscopically visible cellular morphology with antibody-identified cell surface molecule expression. To merge these into one high-throughput, highly multiplexed, single-cell assay, we quantify cell morphological features by their underlying, antibody-measurable molecular components, which empowers mass cytometers to ‘see’ like pathologists. When applied to 71 diverse clinical samples, single-cell morphometric profiling reveals robust and distinct patterns of ‘morphometric’ markers for each major cell type. Individually, lamin B1 highlights acute leukemias, lamin A/C helps distinguish normal from neoplastic mature T cells, and VAMP-7 recapitulates light-cytometric side scatter. Combined with machine learning, morphometric markers form intuitive visualizations of normal and neoplastic cellular distribution and differentiation. When recalibrated for myelomonocytic blast enumeration, this approach is superior to flow cytometry and comparable to expert microscopy, bypassing years of specialized training. The contextualization of traditional surface markers on independent morphometric frameworks permits more sensitive and automated diagnosis of complex hematopoietic diseases.

2020-03-13

Considerations for Using the Vasculature as a Coordinate System to Map All the Cells in the Human Body

Weber GM, Ju Y, Börner K.

HIVE MC-IU

Several ongoing international efforts are developing methods of localizing single cells within organs or mapping the entire human body at the single cell level, including the Chan Zuckerberg Initiative’s Human Cell Atlas (HCA), the Knut and Alice Wallenberg Foundation’s Human Protein Atlas (HPA), and the National Institutes of Health’s Human BioMolecular Atlas Program (HuBMAP). Their goals are to understand cell specialization, interactions, spatial organization in their natural context, and ultimately the function of every cell within the body. In the same way that the Human Genome Project had to assemble sequence data from different people to construct a complete sequence, multiple centers around the world are collecting tissue specimens from diverse populations that vary in age, race, sex, and body size. A challenge will be combining these heterogeneous tissue samples into a 3D reference map that will enable multiscale, multidimensional Google Maps-like exploration of the human body. Key to making alignment of tissue samples work is identifying and using a coordinate system called a Common Coordinate Framework (CCF), which defines the positions, or “addresses,” in a reference body, from whole organs down to functional tissue units and individual cells. In this perspective, we examine the concept of a CCF based on the vasculature and describe why it would be an attractive choice for mapping the human body.

2020-03-27

Tools for the analysis of high-dimensional single-cell RNA sequencing data

Wu Y, Zhang K.

TMC-UCSD

Breakthroughs in the development of high-throughput technologies for profiling transcriptomes at the single-cell level have helped biologists to understand the heterogeneity of cell populations, disease states and developmental lineages. However, these single-cell RNA sequencing (scRNA-seq) technologies generate an extraordinary amount of data, which creates analysis and interpretation challenges. Additionally, scRNA-seq datasets often contain technical sources of noise owing to incomplete RNA capture, PCR amplification biases and/or batch effects specific to the patient or sample. If not addressed, this technical noise can bias the analysis and interpretation of the data. In response to these challenges, a suite of computational tools has been developed to process, analyse and visualize scRNA-seq datasets. Although the specific steps of any given scRNA-seq analysis might differ depending on the biological questions being asked, a core workflow is used in most analyses. Typically, raw sequencing reads are processed into a gene expression matrix that is then normalized and scaled to remove technical noise. Next, cells are grouped according to similarities in their patterns of gene expression, which can be summarized in two or three dimensions for visualization on a scatterplot. These data can then be further analysed to provide an in-depth view of the cell types or developmental trajectories in the sample of interest.
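The "normalized and scaled" step of the core workflow in this abstract can be sketched as follows. The sketch assumes one common convention (scale each cell to a fixed total, log-transform, then z-score each gene); the review does not prescribe this exact recipe. The function names and the default scale factor are hypothetical, and the input is a plain list of per-cell count lists rather than any particular toolkit's data structure.

```python
import math


def normalize_counts(matrix, scale=10_000):
    """Depth-normalize a cells x genes count matrix and log-transform it.

    Each cell is rescaled to `scale` total counts, then log1p is applied.
    Cells with zero total counts are returned as all zeros.
    """
    normalized = []
    for cell in matrix:
        total = sum(cell)
        if total == 0:
            normalized.append([0.0] * len(cell))
            continue
        normalized.append([math.log1p(c / total * scale) for c in cell])
    return normalized


def scale_genes(matrix):
    """Z-score each gene (column) across cells; constant genes become 0."""
    n_cells = len(matrix)
    scaled_columns = []
    for col in zip(*matrix):
        mean = sum(col) / n_cells
        sd = math.sqrt(sum((v - mean) ** 2 for v in col) / n_cells)
        scaled_columns.append([(v - mean) / sd if sd > 0 else 0.0 for v in col])
    return [list(row) for row in zip(*scaled_columns)]


if __name__ == "__main__":
    counts = [[10, 0, 5], [2, 1, 0], [0, 4, 4]]  # toy data: 3 cells x 3 genes
    print(scale_genes(normalize_counts(counts)))
```

The scaled matrix is what a dimensionality reduction or clustering step would then take as input.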

2020-04-01

Integrated molecular imaging technologies for investigation of metals in biological systems: A brief review

Perry WJ, Weiss A, Van de Plas R, Spraggins JM, Caprioli RM, Skaar EP.

TMC-Vanderbilt (Kidney)

Metals play an essential role in biological systems and are required as structural or catalytic co-factors in many proteins. Disruption of the homeostatic control and/or spatial distributions of metals can lead to disease. Imaging technologies have been developed to visualize elemental distributions across a biological sample. Measurement of elemental distributions by imaging mass spectrometry and imaging X-ray fluorescence are increasingly employed with technologies that can assess histological features and molecular compositions. Data from several modalities can be interrogated as multimodal images to correlate morphological, elemental, and molecular properties. Elemental and molecular distributions have also been axially resolved to achieve three-dimensional volumes, dramatically increasing the biological information. In this review, we provide an overview of recent developments in the field of metal imaging with an emphasis on multimodal studies in two and three dimensions. We specifically highlight studies that present technological advancements and biological applications of how metal homeostasis affects human health.

2020-04-02

Reconstructed Single-Cell Fate Trajectories Define Lineage Plasticity Windows during Differentiation of Human PSC-Derived Distal Lung Progenitors

Hurley K, Ding J, Villacorta-Martin C, Herriges MJ, Jacob A, Vedaie M, Alysandratos KD, Sun YL, Lin C, Werder RB, Huang J, Wilson AA, Mithal A, Mostoslavsky G, Oglesby I, Caballero IS, Guttentag SH, Ahangari F, Kaminski N, Rodriguez-Fraticelli A, Camargo F, Bar-Joseph Z, Kotton DN.

HIVE TC-CMU

Alveolar epithelial type 2 cells (AEC2s) are the facultative progenitors responsible for maintaining lung alveoli throughout life but are difficult to isolate from patients. Here, we engineer AEC2s from human pluripotent stem cells (PSCs) in vitro and use time-series single-cell RNA sequencing with lentiviral barcoding to profile the kinetics of their differentiation in comparison to primary fetal and adult AEC2 benchmarks. We observe bifurcating cell-fate trajectories as primordial lung progenitors differentiate in vitro, with some progeny reaching their AEC2 fate target, while others diverge to alternative non-lung endodermal fates. We develop a Continuous State Hidden Markov model to identify the timing and type of signals, such as overexuberant Wnt responses, that induce some early multipotent NKX2-1+ progenitors to lose lung fate. Finally, we find that this initial developmental plasticity is regulatable and subsides over time, ultimately resulting in PSC-derived AEC2s that exhibit a stable phenotype and nearly limitless self-renewal capacity.

2020-05-01

Unsupervised machine learning for exploratory data analysis in imaging mass spectrometry

Verbeeck N, Caprioli RM, Van de Plas R.

TMC-Vanderbilt (Kidney)

Imaging mass spectrometry (IMS) is a rapidly advancing molecular imaging modality that can map the spatial distribution of molecules with high chemical specificity. IMS does not require prior tagging of molecular targets and is able to measure a large number of ions concurrently in a single experiment. While this makes it particularly suited for exploratory analysis, the large amount and high‐dimensional nature of data generated by IMS techniques make automated computational analysis indispensable. Research into computational methods for IMS data has touched upon different aspects, including spectral preprocessing, data formats, dimensionality reduction, spatial registration, sample classification, differential analysis between IMS experiments, and data‐driven fusion methods to extract patterns corroborated by both IMS and other imaging modalities. In this work, we review unsupervised machine learning methods for exploratory analysis of IMS data, with particular focus on (a) factorization, (b) clustering, and (c) manifold learning. To provide a view across the various IMS modalities, we have attempted to include examples from a range of approaches including matrix assisted laser desorption/ionization, desorption electrospray ionization, and secondary ion mass spectrometry‐based IMS. This review aims to be an entry point for both (i) analytical chemists and mass spectrometry experts who want to explore computational techniques; and (ii) computer scientists and data mining specialists who want to enter the IMS field.

2020-05-14

Use of Single Cell -omic Technologies to Study the Gastrointestinal Tract and Diseases, From Single Cell Identities to Patient Features

Islam M, Chen B, Spraggins JM, Kelly RT, Lau KS.

TMC-Vanderbilt (Kidney)

Single cells are the building blocks of tissue systems that determine organ phenotypes, behaviors, and function. Understanding the differences between cell types and their activities might provide us with insights into normal tissue functions, development of disease, and new therapeutic strategies. Although -omic level single cell technologies are a relatively recent development that has been used only in laboratory studies, these approaches might eventually be used in the clinic. We review the prospects of applying single cell genome, transcriptome, epigenome, proteome, and metabolome analyses to gastroenterology and hepatology research. Combining data from multi-omic platforms and rapid technological developments could lead to new diagnostic, prognostic, and therapeutic approaches.

2020-05-19

Discovering New Lipidomic Features Using Cell Type Specific Fluorophore Expression to Provide Spatial and Biological Specificity in a Multimodal Workflow with MALDI Imaging Mass Spectrometry

Jones MA, Cho SH, Patterson NH, Van de Plas R, Spraggins JM, Boothby MR, Caprioli RM.

TMC-Vanderbilt (Kidney)

Identifying the spatial distributions of biomolecules in tissue is crucial for understanding integrated function. Imaging mass spectrometry (IMS) allows simultaneous mapping of thousands of biosynthetic products such as lipids but has needed a means of identifying specific cell-types or functional states to correlate with molecular localization. We report, here, advances starting from identity marking with a genetically encoded fluorophore. The fluorescence emission data were integrated with IMS data through multimodal image processing with advanced registration techniques and data-driven image fusion. In an unbiased analysis of spleens, this integrated technology enabled identification of ether lipid species preferentially enriched in germinal centers. We propose that this use of genetic marking for microanatomical regions of interest can be paired with molecular information from IMS for any tissue, cell-type, or activity state for which fluorescence is driven by a gene-tracking allele and ultimately with outputs of other means of spatial mapping.

2020-06-16

Single-cell Lineage Tracing by Integrating CRISPR-Cas9 Mutations With Transcriptomic Data

Zafar H, Lin C, Bar-Joseph Z.

HIVE TC-CMU

Recent studies combine two novel technologies, single-cell RNA-sequencing and CRISPR-Cas9 barcode editing, for elucidating developmental lineages at the whole organism level. While these studies provided several insights, they face several computational challenges. First, lineages are reconstructed based on noisy and often saturated random mutation data. Additionally, due to the randomness of the mutations, lineages from multiple experiments cannot be combined to reconstruct a species-invariant lineage tree. To address these issues we developed a statistical method, LinTIMaT, which reconstructs cell lineages using a maximum-likelihood framework by integrating mutation and expression data. Our analysis shows that expression data helps resolve the ambiguities arising when lineages are inferred based on mutations alone, while also enabling the integration of different individual lineages for the reconstruction of an invariant lineage tree. LinTIMaT lineages have better cell type coherence, improve the functional significance of gene sets and provide new insights on progenitors and differentiation pathways.

2020-06-30

A Cancer Biologist's Primer on Machine Learning Applications in High-Dimensional Cytometry

Keyes TJ, Domizi P, Lo YC, Nolan GP, Davis KL

TMC-Stanford

The application of machine learning and artificial intelligence to high-dimensional cytometry data sets has increasingly become a staple of bioinformatic data analysis over the past decade. This is especially true in the field of cancer biology, where protocols for collecting multiparameter single-cell data in a high-throughput fashion are rapidly developed. As the use of machine learning methodology in cytometry becomes increasingly common, there is a need for cancer biologists to understand the basic theory and applications of a variety of algorithmic tools for analyzing and interpreting cytometry data. We introduce the reader to several keystone machine learning-based analytic approaches with an emphasis on defining key terms and introducing a conceptual framework for making translational or clinically relevant discoveries. The target audience consists of cancer cell biologists and physician-scientists interested in applying these tools to their own data, but who may have limited training in bioinformatics.

2020-07-14

An Integrated Multi-omic Single-Cell Atlas of Human B Cell Identity.

Glass DR, Tsai AG, Oliveria JP, Hartmann FJ, Kimmey SC, Calderon AA, Borges L, Glass MC, Wagar LE, Davis MM, Bendall SC.

RTI-Stanford

B cells are capable of a wide range of effector functions including antibody secretion, antigen presentation, cytokine production, and generation of immunological memory. A consistent strategy for classifying human B cells by using surface molecules is essential to harness this functional diversity for clinical translation. We developed a highly multiplexed screen to quantify the co-expression of 351 surface molecules on millions of human B cells. We identified differentially expressed molecules and aligned their variance with isotype usage, VDJ sequence, metabolic profile, biosynthesis activity, and signaling response. Based on these analyses, we propose a classification scheme to segregate B cells from four lymphoid tissues into twelve unique subsets, including a CD45RB⁺CD27^- early memory population, a class-switched CD39⁺ tonsil-resident population, and a CD19^hiCD11c⁺ memory population that potently responds to immune activation. This classification framework and underlying datasets provide a resource for further investigations of human B cell identity and function.

2020-07-16

Localization of the lens intermediate filament switch by imaging mass spectrometry

Wang Z, Ryan DJ, Schey KL

TMC-Vanderbilt (Eye/pancreas)

Imaging mass spectrometry (IMS) enables targeted and untargeted visualization of the spatial localization of molecules in tissues with great specificity. The lens is a unique tissue that contains fiber cells corresponding to various stages of differentiation that are packed in a highly ordered spatial arrangement. The application of IMS to lens tissue localizes molecular features that are spatially related to the fiber cell organization. Such spatially resolved molecular information assists our understanding of lens structure and physiology; however, protein IMS studies are typically limited to abundant, soluble, low molecular weight proteins. In this study, a method was developed for imaging low solubility cytoskeletal proteins in the lens, a tissue that is filled with high concentrations of soluble crystallins. Optimized tissue washes combined with on-tissue enzymatic digestion allowed successful imaging of peptides corresponding to known lens cytoskeletal proteins. The resulting peptide signals facilitated segmentation of the bovine lens into molecularly distinct regions. A sharp intermediate filament transition from vimentin to lens-specific beaded filament proteins was detected in the lens cortex. MALDI IMS also revealed the region where posttranslational myristoylation of filensin occurs, and the results indicate that truncation and myristoylation of filensin start soon after filensin expression increases in the inner cortex. From intermediate filament switch to filensin truncation and myristoylation, multiple remarkable changes occur in the narrow region of lens cortex. MALDI images delineated the boundaries of distinct lens regions that will guide further proteomic and interactomic studies.

2020-07-23

Multimodal Analysis of Composition and Spatial Architecture in Human Squamous Cell Carcinoma

Ji AL, Rubin AJ, Thrane K, Jiang S, Reynolds DL, Meyers RM, Guo MG, George BM, Mollbrink A, Bergenstråhle J, Larsson L, Bai Y, Zhu B, Bhaduri A, Meyers JM, Rovira-Clavé X, Hollmig ST, Aasi SZ, Nolan GP, Lundeberg J, Khavari PA

TMC-Stanford

To define the cellular composition and architecture of cutaneous squamous cell carcinoma (cSCC), we combined single-cell RNA sequencing with spatial transcriptomics and multiplexed ion beam imaging from a series of human cSCCs and matched normal skin. cSCC exhibited four tumor subpopulations, three recapitulating normal epidermal states, and a tumor-specific keratinocyte (TSK) population unique to cancer, which localized to a fibrovascular niche. Integration of single-cell and spatial data mapped ligand-receptor networks to specific cell types, revealing TSK cells as a hub for intercellular communication. Multiple features of potential immunosuppression were observed, including T regulatory cell (Treg) co-localization with CD8 T cells in compartmentalized tumor stroma. Finally, single-cell characterization of human tumor xenografts and in vivo CRISPR screens identified essential roles for specific tumor subpopulation-enriched gene networks in tumorigenesis. These data define cSCC tumor and stromal cell subpopulations, the spatial niches where they interact, and the communicating gene networks that they engage in cancer.

2020-08-31

Single-cell metabolic profiling of human cytotoxic T cells

Hartmann FJ, Mrdjen D, McCaffrey E, Glass DR, Greenwald NF, Bharadwaj A, Khair Z, Verberk SGS, Baranski A, Baskar R, Graf W, Van Valen D, Van den Bossche J, Angelo M, Bendall SC.

RTI-Stanford

Cellular metabolism regulates immune cell activation, differentiation and effector functions, but current metabolic approaches lack single-cell resolution and simultaneous characterization of cellular phenotype. In this study, we developed an approach to characterize the metabolic regulome of single cells together with their phenotypic identity. The method, termed single-cell metabolic regulome profiling (scMEP), quantifies proteins that regulate metabolic pathway activity using high-dimensional antibody-based technologies. We employed mass cytometry (cytometry by time of flight, CyTOF) to benchmark scMEP against bulk metabolic assays by reconstructing the metabolic remodeling of in vitro-activated naive and memory CD8⁺ T cells. We applied the approach to clinical samples and identified tissue-restricted, metabolically repressed cytotoxic T cells in human colorectal carcinoma. Combining our method with multiplexed ion beam imaging by time of flight (MIBI-TOF), we uncovered the spatial organization of metabolic programs in human tissues, which indicated exclusion of metabolically repressed immune cells from the tumor-immune boundary. Overall, our approach enables robust approximation of metabolic and functional states in individual cells.

2020-09-01

Nanopore sequencing and the Shasta toolkit enable efficient de novo assembly of eleven human genomes

Shafin K, Pesout T, Lorig-Roach R, Haukness M, Olsen HE, Bosworth C, Armstrong J, Tigyi K, Maurer N, Koren S, Sedlazeck FJ, Marschall T, Mayes S, Costa V, Zook JM, Liu KJ, Kilburn D, Sorensen M, Munson KM, Vollger MR, Monlong J, Garrison E, Eichler EE, Salama S, Haussler D, Green RE, Akeson M, Phillippy A, Miga KH, Carnevali P, Jain M, Paten B

HIVE TC-CMU

De novo assembly of a human genome using nanopore long-read sequences has been reported, but it used more than 150,000 CPU hours and weeks of wall-clock time. To enable rapid human genome assembly, we present Shasta, a de novo long-read assembler, and polishing algorithms named MarginPolish and HELEN. Using a single PromethION nanopore sequencer and our toolkit, we assembled 11 highly contiguous human genomes de novo in 9 d. We achieved roughly 63× coverage, 42-kb read N50 values and 6.5× coverage in reads >100 kb using three flow cells per sample. Shasta produced a complete haploid human genome assembly in under 6 h on a single commercial compute node. MarginPolish and HELEN polished haploid assemblies to more than 99.9% identity (Phred quality score QV = 30) with nanopore reads alone. Addition of proximity-ligation sequencing enabled near chromosome-level scaffolds for all 11 genomes. We compare our assembly performance to existing methods for diploid, haploid and trio-binned human samples and report superior accuracy and speed.
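The pairing of "99.9% identity" with "QV = 30" in the abstract above follows from the standard Phred scale, QV = -10 · log10(error rate). A minimal sketch of that conversion is shown below; the function names are hypothetical and are not part of the Shasta, MarginPolish or HELEN tools.

```python
import math


def phred_from_identity(identity: float) -> float:
    """Convert a fractional identity (e.g. 0.999) to a Phred-scaled quality value."""
    error = 1.0 - identity
    if not 0.0 < error <= 1.0:
        raise ValueError("identity must be in the range [0, 1)")
    # Phred scale: QV = -10 * log10(probability of error)
    return -10.0 * math.log10(error)


def identity_from_phred(qv: float) -> float:
    """Inverse conversion: Phred quality value back to fractional identity."""
    return 1.0 - 10.0 ** (-qv / 10.0)


if __name__ == "__main__":
    print(round(phred_from_identity(0.999), 2))
    print(identity_from_phred(30))
```

On this scale, each additional 10 QV units corresponds to a tenfold reduction in the per-base error rate.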

2020-09-01

Changes to Zonular Tension Alters the Subcellular Distribution of AQP5 in Regions of Influx and Efflux of Water in the Rat Lens

Petrova RS, Bavana N, Zhao R, Schey KL, Donaldson PJ

TMC-Vanderbilt (Eye/pancreas)

Purpose: The lens uses circulating fluxes of ions and water that enter the lens at both poles and exit at the equator to maintain its optical properties. We have mapped the subcellular distribution of the lens aquaporins (AQP0, AQP1, and AQP5) in these water influx and efflux zones and investigated how their membrane location is affected by changes in tension applied to the lens by the zonules. Methods: Immunohistochemistry using AQP antibodies was performed on axial sections obtained from rat lenses that had been removed from the eye and then fixed or were fixed in situ to maintain zonular tension. Zonular tension was pharmacologically modulated by applying either tropicamide (increased) or pilocarpine (decreased). AQP labeling was visualized using confocal microscopy. Results: Modulation of zonular tension had no effect on AQP1 or AQP0 labeling in either the water efflux or influx zones. In contrast, AQP5 labeling changed from membranous to cytoplasmic in response to both mechanical and pharmacologically induced reductions in zonular tension in both the efflux zone and anterior (but not posterior) influx zone associated with the lens sutures. Conclusions: Altering zonular tension dynamically regulates the membrane trafficking of AQP5 in the efflux and anterior influx zones to potentially change the magnitude of circulating water fluxes in the lens.

2020-09-03

Coordinated Cellular Neighborhoods Orchestrate Antitumoral Immunity at the Colorectal Cancer Invasive Front

Schürch CM, Bhate SS, Barlow GL, Phillips DJ, Noti L, Zlobec I, Chu P, Black S, Demeter J, McIlwain DR, Kinoshita S, Samusik N, Goltsev Y, Nolan GP.

TMC-Stanford

Antitumoral immunity requires organized, spatially nuanced interactions between components of the immune tumor microenvironment (iTME). Understanding this coordinated behavior in effective versus ineffective tumor control will advance immunotherapies. We re-engineered co-detection by indexing (CODEX) for paraffin-embedded tissue microarrays, enabling simultaneous profiling of 140 tissue regions from 35 advanced-stage colorectal cancer (CRC) patients with 56 protein markers. We identified nine conserved, distinct cellular neighborhoods (CNs), a collection of components characteristic of the CRC iTME. Enrichment of PD-1⁺CD4⁺ T cells only within a granulocyte CN positively correlated with survival in a high-risk patient subset. Coupling of tumor and immune CNs, fragmentation of T cell and macrophage CNs, and disruption of inter-CN communication were associated with inferior outcomes. This study provides a framework for interrogating how complex biological processes, such as antitumoral immunity, occur through concerted actions of cells and spatial domains.

2020-09-04

The impact of air transport availability on research collaboration: A case study of four universities

Ploszaj A, Yan X, Börner K.

HIVE MC-IU

This paper analyzes the impact of air transport connectivity and accessibility on scientific collaboration. Numerous studies have demonstrated that the likelihood of collaboration declines as the distance between potential collaborators increases. These works commonly use simple measures of physical distance rather than actual flight capacity and frequency. Our study addresses this limitation by focusing on the relationship between flight availability and the number of scientific co-publications. Furthermore, we distinguish two components of flight availability: (1) direct and indirect air connections between airports; and (2) distance to the nearest airport from cities and towns where authors of scientific articles have their professional affiliations. Based on Zero-inflated Negative Binomial Regression, we provide evidence that greater flight availability is associated with more frequent scientific collaboration. More flight connections (connectivity) and proximity of airport (accessibility) increase the expected number of coauthored scientific papers. Moreover, direct flights and flights with one transfer are more valuable for intensifying scientific cooperation than travels involving more connecting flights. Further, analysis of four organizational sub-datasets (Arizona State University, Indiana University Bloomington, Indiana University-Purdue University Indianapolis, and University of Michigan) shows that the relationship between airline transport availability and scientific collaboration is not uniform, but is associated with the research profile of an institution and the characteristics of the airport that serves this institution.

2020-09-29

Targeting Phosphotyrosine in Native Proteins with Conditional, Bispecific Antibody Traps

Zhou XX, Bracken CJ, Zhang K, Zhou J, Mou Y, Wang L, Cheng Y, Leung KK, Wells JA.

RTI-Northwestern

Engineering sequence-specific antibodies (Abs) against phosphotyrosine (pY) motifs embedded in folded polypeptides remains highly challenging because of the stringent requirement for simultaneous recognition of the pY motif and the surrounding folded protein epitope. Here, we present a method named phosphotyrosine Targeting by Recombinant Ab Pair, or pY-TRAP, for in vitro engineering of binders for native pY proteins. Specifically, we create the pY protein by unnatural amino acid misincorporation, mutagenize a universal pY-binding Ab to create a first binder B1 for the pY motif on the pY protein, and then select against the B1-pY protein complex for a second binder B2 that recognizes the composite epitope of B1 and the pY-containing protein complex. We applied pY-TRAP to create highly specific binders to folded Ub-pY59, a rarely studied Ub phosphoform exclusively observed in cancerous tissues, and ZAP70-pY248, a kinase phosphoform regulated in feedback signaling pathways in T cells. The pY-TRAPs do not have detectable binding to wild-type proteins or to other pY peptides or proteins tested. This pY-TRAP approach serves as a generalizable method for engineering sequence-specific Ab binders to native pY proteins.

2020-10-06

Spatial metabolomics of the human kidney using MALDI trapped ion mobility imaging mass spectrometry

Neumann EK, Migas LG, Allen JL, Caprioli RM, Van de Plas R, Spraggins JM

TMC-Vanderbilt (Kidney)

Low molecular weight metabolites are essential for defining the molecular phenotypes of cells. However, spatial metabolomics tools often lack the sensitivity, specificity, and spatial resolution to provide comprehensive descriptions of these species in tissue. MALDI imaging mass spectrometry (IMS) of low molecular weight ions is particularly challenging as MALDI matrix clusters are often nominally isobaric with multiple metabolite ions, requiring high resolving power instrumentation or derivatization to circumvent this issue. An alternative to this is to perform ion mobility separation before ion detection, enabling the visualization of metabolites without the interference of matrix ions. Additional difficulties surrounding low molecular weight metabolite visualization include achieving high resolution imaging while maintaining sufficient ion numbers for broad and representative analysis of the tissue chemical complement. Here, we use MALDI timsTOF IMS to image low molecular weight metabolites at higher spatial resolution than most metabolite MALDI IMS experiments (20 μm) while maintaining broad coverage within the human kidney. We demonstrate that trapped ion mobility spectrometry (TIMS) can resolve matrix peaks from metabolite signal and separate both isobaric and isomeric metabolites with different distributions within the kidney. The added ion mobility data dimension dramatically increased the peak capacity for spatial metabolomics experiments. Through this improved sensitivity, we have found >40 low molecular weight metabolites in human kidney tissue, such as argininic acid, acetylcarnitine, and choline that localize to the cortex, medulla, and renal pelvis, respectively. Future work will involve further exploring metabolomic profiles of human kidneys as a function of age, sex, and race.

2020-10-09

An Integrated Microfluidic Probe for Mass Spectrometry Imaging of Biological Samples

Li X, Yin R, Hu H, Li Y, Sun X, Dey SK, Laskin J.

TTD-Purdue

Ambient ionization based on liquid extraction is widely used in mass spectrometry imaging (MSI) of molecules in biological samples. The development of nanospray desorption electrospray ionization (nano-DESI) has enabled the robust imaging of tissue sections with high spatial resolution. However, the fabrication of the nano-DESI probe is challenging, which limits its dissemination to the broader scientific community. Herein, we describe the design and performance of an integrated microfluidic probe (iMFP) for nano-DESI MSI. The glass iMFP, fabricated using photolithography, wet etching, and polishing, shows comparable performance to the capillary-based nano-DESI MSI in terms of stability and sensitivity; a spatial resolution of better than 25 μm was obtained in these first proof-of-principle experiments. The iMFP is easy to operate and align in front of a mass spectrometer, which will facilitate broader use of liquid-extraction-based MSI in biological research, drug discovery, and clinical studies.

2020-10-19

CDKL5: a promising new therapeutic target for acute kidney injury?

de Caestecker MP.

TMC-Vanderbilt (Kidney)

Online ahead of print. No abstract available.

2020-10-27

Iterative point set registration for aligning scRNA-seq data

Alavi A, Bar-Joseph Z

HIVE TC-CMU

Several studies profile similar single cell RNA-Seq (scRNA-Seq) data using different technologies and platforms. A number of alignment methods have been developed to enable the integration and comparison of scRNA-Seq data from such studies. While each performs well on some of the datasets, to date no method has been able to both perform the alignment using the original expression space and generalize to new data. To enable such analysis we developed Single Cell Iterative Point set Registration (SCIPR), which extends methods that were successfully applied to align image data to scRNA-Seq. We discuss the required changes, the resulting optimization function, and algorithms for learning a transformation function for aligning data. We tested SCIPR on several scRNA-Seq datasets. As we show, it successfully aligns data from several different cell types, improving upon prior methods proposed for this task. In addition, we show the parameters learned by SCIPR can be used to align data not used in the training and to identify key cell type-specific genes.
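For readers unfamiliar with iterative point set registration, the general idea (alternate between matching each source point to its nearest target point and updating a transformation that reduces the matched distances) can be sketched as below. This is a deliberately simplified illustration that learns only a translation; it is not the SCIPR implementation, and all function names are hypothetical.

```python
def nearest_index(point, candidates):
    """Return the index of the candidate closest to `point` (squared Euclidean distance)."""
    return min(
        range(len(candidates)),
        key=lambda j: sum((p - c) ** 2 for p, c in zip(point, candidates[j])),
    )


def register_by_translation(source, target, n_iter=10):
    """Iteratively estimate a translation that aligns `source` points onto `target` points."""
    dim = len(source[0])
    shift = [0.0] * dim
    for _ in range(n_iter):
        # Apply the current transformation estimate to the source points.
        moved = [[p + s for p, s in zip(pt, shift)] for pt in source]
        # Matching step: pair every moved source point with its nearest target point.
        matches = [target[nearest_index(pt, target)] for pt in moved]
        # Update step: the mean residual is the least-squares translation update.
        step = [
            sum(m[k] - pt[k] for m, pt in zip(matches, moved)) / len(moved)
            for k in range(dim)
        ]
        shift = [s + d for s, d in zip(shift, step)]
    return shift


if __name__ == "__main__":
    target = [[float(x), float(y)] for x in range(5) for y in range(5)]
    source = [[x - 0.1, y + 0.05] for x, y in target]
    print(register_by_translation(source, target))
```

Because the learned transformation is a function of the coordinates rather than of the specific points, it can in principle be applied to points that were not used to fit it, which is the property the abstract refers to as generalizing to new data.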

2020-11-01

High-Parameter Immune Profiling with CyTOF

Sahaf B, Rahman A, Maecker HT, Bendall SC

RTI-Stanford

Mass cytometry, or CyTOF, is a useful technology for high-parameter single-cell phenotyping, especially from suspension cells such as blood or PBMC. It is particularly appealing to monitor the systemic immune changes that could accompany cancer immunotherapy. Here we present a reference panel for identification of all major immune cell populations, with flexibility for addition of trial-specific markers. We also describe best-practice measures for minimizing and tracking batch variability. These include: sample barcoding, use of spiked-in reference cells, and lyophilization of the antibody cocktail. Our protocol assumes the use of cryopreserved PBMC, both for convenience of batching samples and for maximum comparability across patients and time points. Finally, we show an option for automated analysis using the Astrolabe platform (Astrolabe Diagnostics, Inc.).

2020-11-02

Landscape of coordinated immune responses to H1N1 challenge in humans.

Rahil Z, Leylek R, Schürch CM, Chen H, Bjornson-Hooper Z, Christensen SR, Gherardini PF, Bhate SS, Spitzer MH, Fragiadakis GK, Mukherjee N, Kim N, Jiang S, Yo J, Gaudilliere B, Affrime M, Bock B, Hensley SE, Idoyaga J, Aghaeepour N, Kim K, Nolan GP, McIlwain DR.

TMC-Stanford

Influenza is a significant cause of morbidity and mortality worldwide. Here we show changes in the abundance and activation states of more than 50 immune cell subsets in 35 individuals over 11 time points during human A/California/2009 (H1N1) virus challenge monitored using mass cytometry along with other clinical assessments. Peak change in monocyte, B cell, and T cell subset frequencies coincided with peak virus shedding, followed by marked activation of T and NK cells. Results led to the identification of CD38 as a critical regulator of plasmacytoid dendritic cell function in response to influenza virus. Machine learning using study-derived clinical parameters and single-cell data effectively classified and predicted susceptibility to infection. The coordinated immune cell dynamics defined in this study provide a framework for identifying novel correlates of protection in the evaluation of future influenza therapeutics.

2020-11-05

Advances in Proximity Ligation in situ Hybridization (PLISH)

Nagendran M, Andruska AM, Harbury PB, Desai TJ.

TTD-Stanford

Understanding tissues in the context of development, maintenance and disease requires determining the molecular profiles of individual cells within their native in vivo spatial context. We developed a Proximity Ligation in situ Hybridization technology (PLISH) that enables quantitative measurement of single cell gene expression in intact tissues, which we have now updated. By recording spatial information for every profiled cell, PLISH enables retrospective mapping of distinct cell classes and inference of their in vivo interactions. PLISH has high sensitivity, specificity and signal to noise ratio. It is also rapid, scalable, and does not require expertise in molecular biology so it can be easily adopted by basic and clinical researchers.

2020-11-06

Carrier-assisted One-pot Sample Preparation for Targeted Proteomics Analysis of Small Numbers of Human Cells

Martin K, Zhang T, Zhang P, Chrisler WB, Thomas FL, Liu F, Liu T, Qian WJ, Smith RD, Shi T.

TTD-PNNL/Northwestern

Protein analysis of small numbers of human cells is primarily achieved by targeted proteomics with antibody-based immunoassays, which have inherent limitations (e.g., low multiplex and unavailability of antibodies for new proteins). Mass spectrometry (MS)-based targeted proteomics has emerged as an alternative because it is antibody-free, high multiplex, and has high specificity and quantitation accuracy. Recent advances in MS instrumentation make MS-based targeted proteomics possible for multiplexed quantification of highly abundant proteins in single cells. However, there is a technical challenge for effective processing of single cells with minimal sample loss for MS analysis. To address this issue, we have recently developed a convenient protein carrier-assisted one-pot sample preparation coupled with liquid chromatography (LC) - selected reaction monitoring (SRM) termed cLC-SRM for targeted proteomics analysis of small numbers of human cells. This method capitalizes on using the combined excessive exogenous protein as a carrier and low-volume one-pot processing to greatly reduce surface adsorption losses as well as high-specificity LC-SRM to effectively address the increased dynamic concentration range due to the addition of exogenous carrier protein. Its utility has been demonstrated by accurate quantification of most moderately abundant proteins in small numbers of cells (e.g., 10-100 cells) and highly abundant proteins in single cells. The easy-to-implement features and no need for specific devices make this method readily accessible to most proteomics laboratories. Herein we have provided a detailed protocol for cLC-SRM analysis of small numbers of human cells including cell sorting, cell lysis and digestion, LC-SRM analysis, and data analysis. Further improvements in detection sensitivity and sample throughput are needed towards targeted single-cell proteomics analysis. We anticipate that cLC-SRM will be broadly applied to biomedical research and systems biology with the potential of facilitating precision medicine.

2020-11-13

Guidelines for reporting single-cell RNA-seq experiments.

Füllgrabe A, George N, Green M, Nejad P, Aronow B, Fexova SK, Fischer C, Freeberg MA, Huerta L, Morrison N, Scheuermann RH, Taylor D, Vasilevsky N, Clarke L, Gehlenborg N, Kent J, Marioni J, Teichmann S, Brazma A, Papatheodorou I

HIVE TC-Harvard

No abstract available.

2020-11-30

Tetraspanin-7 regulation of L-type voltage-dependent calcium channels controls pancreatic β-cell insulin secretion

Dickerson MT, Dadi PK, Butterworth RB, Nakhe AY, Graff SM, Zaborska KE, Schaub CM, Jacobson DA

TMC-Vanderbilt (Eye/pancreas)

Key points: Tetraspanin (TSPAN) proteins regulate many biological processes, including intracellular calcium (Ca²⁺) handling. TSPAN-7 is enriched in pancreatic islet cells; however, the function of islet TSPAN-7 has not been identified. Here, we characterize how β-cell TSPAN-7 regulates Ca²⁺ handling and hormone secretion. We find that TSPAN-7 reduces β-cell glucose-stimulated Ca²⁺ entry, slows Ca²⁺ oscillation frequency and decreases glucose-stimulated insulin secretion. TSPAN-7 controls β-cell function through a direct interaction with L-type voltage-dependent Ca²⁺ channels (Ca_V 1.2 and Ca_V 1.3), which reduces channel Ca²⁺ conductance. TSPAN-7 slows activation of Ca_V 1.2 and accelerates recovery from voltage-dependent inactivation; TSPAN-7 also slows Ca_V 1.3 inactivation kinetics. These findings strongly implicate TSPAN-7 as a key regulator in determining the set-point of glucose-stimulated Ca²⁺ influx and insulin secretion. Abstract: Glucose-stimulated insulin secretion (GSIS) is regulated by calcium (Ca²⁺) entry into pancreatic β-cells through voltage-dependent Ca²⁺ (Ca_V) channels. Tetraspanin (TSPAN) transmembrane proteins control Ca²⁺ handling, and thus they may also modulate GSIS. TSPAN-7 is the most abundant islet TSPAN and immunostaining of mouse and human pancreatic slices shows that TSPAN-7 is highly expressed in β- and α-cells; however, the function of islet TSPAN-7 has not been determined. Here, we show that TSPAN-7 knockdown (KD) increases glucose-stimulated Ca²⁺ influx into mouse and human β-cells. Additionally, mouse β-cell Ca²⁺ oscillation frequency was accelerated by TSPAN-7 KD. Because TSPAN-7 KD also enhanced Ca²⁺ entry when membrane potential was clamped with depolarization, the effect of TSPAN-7 on Ca_V channel activity was examined. TSPAN-7 KD enhanced L-type Ca_V currents in mouse and human β-cells. Conversely, heterologous expression of TSPAN-7 with Ca_V 1.2 and Ca_V 1.3 L-type Ca_V channels decreased Ca_V currents and reduced Ca²⁺ influx through both channels. This was presumably the result of a direct interaction of TSPAN-7 and L-type Ca_V channels because TSPAN-7 coimmunoprecipitated with both Ca_V 1.2 and Ca_V 1.3 from primary human β-cells and from a heterologous expression system. Finally, TSPAN-7 KD in human β-cells increased basal (5.6 mM glucose) and stimulated (45 mM KCl + 14 mM glucose) insulin secretion. These findings strongly suggest that TSPAN-7 modulation of β-cell L-type Ca_V channels is a key determinant of β-cell glucose-stimulated Ca²⁺ entry and thus the set-point of GSIS.

2020-12-01

Effect of MALDI matrices on lipid analyses of biological tissues using MALDI-2 postionization mass spectrometry

McMillen JC, Fincher JA, Klein DR, Spraggins JM, Caprioli RM

TMC-Vanderbilt (Kidney)

Matrix-assisted laser desorption/ionization imaging mass spectrometry (MALDI IMS) allows for highly multiplexed, untargeted detection of many hundreds of analytes from tissue. Recently, laser postionization (MALDI-2) has been developed for increased ion yield and sensitivity for lipid IMS. However, the dependence of MALDI-2 performance on the various lipid classes is largely unknown. To understand the effect of the applied matrix on MALDI-2 analysis of lipids, samples including an equimolar lipid standard mixture, various tissue homogenates, and intact rat kidney tissue sections were analyzed using the following matrices: α-cyano-4-hydroxycinnamic acid, 2',5'-dihydroxyacetophenone, 2',5'-dihydroxybenzoic acid (DHB), and norharmane (NOR). Lipid signal enhancement of protonated species using MALDI-2 technology varied based on the matrix used. Although signal improvements were observed for all matrices, the most dramatic effects using MALDI-2 were observed using NOR and DHB. For lipid standards analyzed by MALDI-2, NOR provided the broadest coverage, enabling the detection of all 13 protonated standards, including nonpolar lipids, whereas DHB gave less coverage but gave the highest signal increase for those lipids recorded. With respect to tissue homogenates and rat kidney tissue, mass spectra were compared and showed that the number and intensity of neutral lipids tentatively identified with MALDI-2 using NOR increased significantly (e.g., fivefold intensity increase for triacylglycerol). In the case of DHB with MALDI-2, the number of protonated lipids identified from tissue homogenates doubled, with 152 on average compared with 76 with MALDI alone. High spatial resolution imaging (~20 μm) of rat kidney tissue showed similar results using DHB, with 125 lipids tentatively identified from MALDI-2 spectra versus just 72 using standard MALDI. Of the four matrices tested, NOR provided the greatest increase in sensitivity for neutral lipids (triacylglycerol, diacylglycerol, monoacylglycerol, and cholesterol ester), and DHB provided the highest overall number of lipids detected using MALDI-2 technology.

2020-12-01

Integrating ion mobility and imaging mass spectrometry for comprehensive analysis of biological tissues: A brief review and perspective

Rivera ES, Djambazova KV, Neumann EK, Caprioli RM, Spraggins JM

TMC-Vanderbilt (Kidney)

Imaging mass spectrometry (IMS) technologies are capable of mapping a wide array of biomolecules in diverse cellular and tissue environments. IMS has emerged as an essential tool for providing spatially targeted molecular information due to its high sensitivity, wide molecular coverage, and chemical specificity. One of the major challenges for mapping the complex cellular milieu is the presence of many isomers and isobars in these samples. This challenge is traditionally addressed using orthogonal liquid chromatography (LC)-based analysis, although common approaches such as chromatography and electrophoresis cannot be performed at timescales that are compatible with most imaging applications. Ion mobility offers rapid, gas-phase separations that are readily integrated with IMS workflows in order to provide additional data dimensionality that can improve signal-to-noise, dynamic range, and specificity. Here, we review recent examples of ion mobility coupled to IMS and highlight their importance to the field.

2020-12-01

Progenitor identification and SARS-CoV-2 infection in human distal lung organoids

Salahudeen AA, Choi SS, Rustagi A, Zhu J, van Unen V, de la O SM, Flynn RA, Margalef-Català M, Santos AJM, Ju J, Batish A, Usui T, Zheng GXY, Edwards CE, Wagar LE, Luca V, Anchang B, Nagendran M, Nguyen K, Hart DJ, Terry JM, Belgrader P, Ziraldo SB, Mikkelsen TS, Harbury PB, Glenn JS, Garcia KC, Davis MM, Baric RS, Sabatti C, Amieva MR, Blish CA, Desai TJ, Kuo CJ.

TTD-Stanford

The distal lung contains terminal bronchioles and alveoli that facilitate gas exchange. Three-dimensional in vitro human distal lung culture systems would strongly facilitate the investigation of pathologies such as interstitial lung disease, cancer and coronavirus disease 2019 (COVID-19) pneumonia caused by severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2). Here we describe the development of a long-term feeder-free, chemically defined culture system for distal lung progenitors as organoids derived from single adult human alveolar epithelial type II (AT2) or KRT5⁺ basal cells. AT2 organoids were able to differentiate into AT1 cells, and basal cell organoids developed lumens lined with differentiated club and ciliated cells. Single-cell analysis of KRT5⁺ cells in basal organoids revealed a distinct population of ITGA6⁺ITGB4⁺ mitotic cells, whose offspring further segregated into a TNFRSF12A^hi subfraction that comprised about ten per cent of KRT5⁺ basal cells. This subpopulation formed clusters within terminal bronchioles and exhibited enriched clonogenic organoid growth activity. We created distal lung organoids with apical-out polarity to present ACE2 on the exposed external surface, facilitating infection of AT2 and basal cultures with SARS-CoV-2 and identifying club cells as a target population. This long-term, feeder-free culture of human distal lung organoids, coupled with single-cell analysis, identifies functional heterogeneity among basal cells and establishes a facile in vitro organoid model of human distal lung infections, including COVID-19-associated pneumonia.

2020-12-02

Lipid Landscape of the Human Retina and Supporting Tissues Revealed by High-Resolution Imaging Mass Spectrometry

Anderson DMG, Messinger JD, Patterson NH, Rivera ES, Kotnala A, Spraggins JM, Caprioli RM, Curcio CA, Schey KL

TMC-Vanderbilt (Eye/pancreas)

The human retina provides vision at light levels ranging from starlight to sunlight. Its supporting tissues regulate plasma-delivered lipophilic essentials for vision, including retinoids. The macula is an anatomic specialization for high-acuity and color vision that is also vulnerable to prevalent blinding diseases. The retina's exquisite architecture comprises numerous cell types that are aligned horizontally, yielding structurally distinct cell, synaptic, and vascular layers that are visible in histology and in diagnostic clinical imaging. MALDI imaging mass spectrometry (IMS) is now capable of uniting low micrometer spatial resolution with high levels of chemical specificity. In this study, a multimodal imaging approach fortified with accurate multi-image registration was used to localize lipids in human retina tissue at laminar, cellular, and subcellular levels. Multimodal imaging results indicate differences in distributions and abundances of lipid species across and within single cell types. Of note are distinct localizations of signals within specific layers of the macula. For example, phosphatidylethanolamine and phosphatidylinositol lipids were localized to central RPE cells, whereas specific plasmalogen lipids were localized to cells of the perifoveal RPE and Henle fiber layer. Subcellular compartments of photoreceptors were distinguished by PE(20:0_22:5) in the outer nuclear layer, PE(18:0_22:6) in outer and inner segments, and cardiolipin CL(70:5) in the mitochondria-rich inner segments. Several lipids, differing by a single double bond, have markedly different distributions between the central fovea and the ganglion cell and inner nuclear layers. A lipid atlas, initiated in this study, can serve as a reference database for future examination of diseased tissues.

2020-12-02

Multimodal Imaging Mass Spectrometry: Next Generation Molecular Mapping in Biology and Medicine

Neumann EK, Djambazova KV, Caprioli RM, Spraggins JM

TMC-Vanderbilt (Kidney)

Imaging mass spectrometry has become a mature molecular mapping technology that is used for molecular discovery in many medical and biological systems. While powerful by itself, imaging mass spectrometry can be complemented by the addition of other orthogonal, chemically informative imaging technologies to maximize the information gained from a single experiment and enable deeper understanding of biological processes. Within this review, we describe MALDI, SIMS, and DESI imaging mass spectrometric technologies and how these have been integrated with other analytical modalities such as microscopy, transcriptomics, spectroscopy, and electrochemistry in a field termed multimodal imaging. We explore the future of this field and discuss forthcoming developments that will bring new insights to help unravel the molecular complexities of biological systems, from single cells to functional tissue structures and organs.

2020-12-10

GCNG: graph convolutional networks for inferring gene interaction from spatial transcriptomics data

Yuan Y, Bar-Joseph Z

HIVE TC-CMU

Most methods for inferring gene-gene interactions from expression data focus on intracellular interactions. The availability of high-throughput spatial expression data opens the door to methods that can infer such interactions both within and between cells. To achieve this, we developed Graph Convolutional Neural networks for Genes (GCNG). GCNG encodes the spatial information as a graph and combines it with expression data using supervised training. GCNG improves upon prior methods used to analyze spatial transcriptomics data and can propose novel pairs of extracellular interacting genes. The output of GCNG can also be used for downstream analysis including functional gene assignment. Supporting website with software and data: https://github.com/xiaoyeye/GCNG.

2020-12-10

High-Spatial-Resolution Multi-Omics Sequencing via Deterministic Barcoding in Tissue

Liu Y, Yang M, Deng Y, Su G, Enninful A, Guo CC, Tebaldi T, Zhang D, Kim D, Bai Z, Norris E, Pan A, Li J, Xiao Y, Halene S, Fan R

TTD-Yale

We present deterministic barcoding in tissue for spatial omics sequencing (DBiT-seq) for co-mapping of mRNAs and proteins in a formaldehyde-fixed tissue slide via next-generation sequencing (NGS). Parallel microfluidic channels were used to deliver DNA barcodes to the surface of a tissue slide, and crossflow of two sets of barcodes, A1-50 and B1-50, followed by ligation in situ, yielded a 2D mosaic of tissue pixels, each containing a unique full barcode AB. Application to mouse embryos revealed major tissue types in early organogenesis as well as fine features like microvasculature in a brain and pigmented epithelium in an eye field. Gene expression profiles in 10-μm pixels conformed into the clusters of single-cell transcriptomes, allowing for rapid identification of cell types and spatial distributions. DBiT-seq can be adopted by researchers with no experience in microfluidics and may find applications in a range of fields including developmental biology, cancer biology, neuroscience, and clinical pathology.

2020-12-18

RIPK3-mediated inflammation is a conserved β cell response to ER stress

Yang B, Maddison LA, Zaborska KE, Dai C, Yin L, Tang Z, Zang L, Jacobson DA, Powers AC, Chen W

TMC-Vanderbilt (Eye/pancreas)

Islet inflammation is an important etiopathology of type 2 diabetes; however, the underlying mechanisms are not well defined. Using complementary experimental models, we discovered RIPK3-dependent IL1B induction in β cells as an instigator of islet inflammation. In cultured β cells, ER stress activated RIPK3, leading to NF-κB-mediated proinflammatory gene expression. In a zebrafish muscle insulin resistance model, overnutrition caused islet inflammation, β cell dysfunction, and loss in an ER stress-, ripk3-, and il1b-dependent manner. In mouse islets, high-fat diet triggered IL1B expression in β cells before macrophage recruitment in vivo, and RIPK3 inhibition suppressed palmitate-induced β cell dysfunction and Il1b expression in vitro. Furthermore, in human islets grafted in hyperglycemic mice, a marked increase in ER stress, RIPK3, and NF-κB activation in β cells was accompanied by murine macrophage infiltration. Thus, RIPK3-mediated induction of proinflammatory mediators is a conserved, previously unrecognized β cell response to metabolic stress and a mediator of the ensuing islet inflammation.

2021-01-01

A multimodal and integrated approach to interrogate human kidney biopsies with rigor and reproducibility: guidelines from the Kidney Precision Medicine Project

El-Achkar TM, Eadon MT, Menon R, Lake BB, Sigdel TK, Alexandrov T, Parikh S, Zhang G, Dobi D, Dunn KW, Otto EA, Anderton CR, Carson JM, Luo J, Park C, Hamidi H, Zhou J, Hoover P, Schroeder A, Joanes M, Azeloglu EU, Sealfon R, Winfree S, Steck B, He Y, D'Agati V, Iyengar R, Troyanskaya OG, Barisoni L, Gaut J, Zhang K, Laszik Z, Rovin BH, Dagher PC, Sharma K, Sarwal MM, Hodgin JB, Alpers CE, Kretzler M, Jain S

TMC-UCSD

Comprehensive and spatially mapped molecular atlases of organs at a cellular level are a critical resource to gain insights into pathogenic mechanisms and personalized therapies for diseases. The Kidney Precision Medicine Project (KPMP) is an endeavor to generate three-dimensional (3-D) molecular atlases of healthy and diseased kidney biopsies by using multiple state-of-the-art omics and imaging technologies across several institutions. Obtaining rigorous and reproducible results from disparate methods and at different sites to interrogate biomolecules at a single-cell level or in 3-D space is a significant challenge that can be a futile exercise if not well controlled. We describe a "follow the tissue" pipeline for generating a reliable and authentic single-cell/region 3-D molecular atlas of human adult kidney. Our approach emphasizes quality assurance, quality control, validation, and harmonization across different omics and imaging technologies from sample procurement, processing, storage, shipping to data generation, analysis, and sharing. We established benchmarks for quality control, rigor, reproducibility, and feasibility across multiple technologies through a pilot experiment using common source tissue that was processed and analyzed at different institutions and different technologies. A peer review system was established to critically review quality control measures and the reproducibility of data generated by each technology before their being approved to interrogate clinical biopsy specimens. The process established economizes the use of valuable biopsy tissue for multiomics and imaging analysis with stringent quality control to ensure rigor and reproducibility of results and serves as a model for precision medicine projects across laboratories, institutions and consortia.

2021-01-18

Deep Learning Approach for Dynamic Sparse Sampling for High-Throughput Mass Spectrometry Imaging

Helminiak D, Hu H, Laskin J, Ye DH

TTD-Purdue

A Supervised Learning Approach for Dynamic Sampling (SLADS) addresses traditional issues with the incorporation of stochastic processes into a compressed sensing method. Statistical features, extracted from a sample reconstruction, estimate entropy reduction with regression models, in order to dynamically determine optimal sampling locations. This work introduces an enhanced SLADS method, in the form of a Deep Learning Approach for Dynamic Sampling (DLADS), showing reductions in sample acquisition times for high-fidelity reconstructions of ~70-80% over traditional rectilinear scanning. These improvements are demonstrated for dimensionally asymmetric, high-resolution molecular images of mouse uterine and kidney tissues, as obtained using Nanospray Desorption ElectroSpray Ionization (nano-DESI) Mass Spectrometry Imaging (MSI). The methodology for training set creation is adjusted to mitigate stretching artifacts generated when using prior SLADS approaches. Transitioning to DLADS removes the need for feature extraction, further advanced with the employment of convolutional layers to leverage inter-pixel spatial relationships. Additionally, DLADS demonstrates effective generalization, despite dissimilar training and testing data. Overall, DLADS is shown to maximize potential experimental throughput for nano-DESI MSI.

2021-01-27

Integrated spatial genomics reveals global architecture of single nuclei

Takei Y, Yun J, Zheng S, Ollikainen N, Pierson N, White J, Shah S, Thomassie J, Suo S, Eng CL, Guttman M, Yuan GC, Cai L

TTD-Cal Tech

Identifying the relationships between chromosome structures, nuclear bodies, chromatin states and gene expression is an overarching goal of nuclear-organization studies¹⁻⁴. Because individual cells appear to be highly variable at all these levels⁵, it is essential to map different modalities in the same cells. Here we report the imaging of 3,660 chromosomal loci in single mouse embryonic stem (ES) cells using DNA seqFISH+, along with 17 chromatin marks and subnuclear structures by sequential immunofluorescence and the expression profile of 70 RNAs. Many loci were invariably associated with immunofluorescence marks in single mouse ES cells. These loci form 'fixed points' in the nuclear organizations of single cells and often appear on the surfaces of nuclear bodies and zones defined by combinatorial chromatin marks. Furthermore, highly expressed genes appear to be pre-positioned to active nuclear zones, independent of bursting dynamics in single cells. Our analysis also uncovered several distinct mouse ES cell subpopulations with characteristic combinatorial chromatin states. Using clonal analysis, we show that the global levels of some chromatin marks, such as H3 trimethylation at lysine 27 (H3K27me3) and macroH2A1 (mH2A1), are heritable over at least 3-4 generations, whereas other marks fluctuate on a faster time scale. This seqFISH+-based spatial multimodal approach can be used to explore nuclear organization and cell states in diverse biological systems.

2021-01-28

A Generic Framework and Library for Exploration of Small Multiples through Interactive Piling

Lekschas F, Zhou X, Chen W, Gehlenborg N, Bach B, Pfister H

HIVE TC-Harvard

Small multiples are miniature representations of visual information used generically across many domains. Handling large numbers of small multiples imposes challenges on many analytic tasks like inspection, comparison, navigation, or annotation. To address these challenges, we developed a framework and implemented a library called PILING.JS for designing interactive piling interfaces. Based on the piling metaphor, such interfaces afford flexible organization, exploration, and comparison of large numbers of small multiples by interactively aggregating visual objects into piles. Based on a systematic analysis of previous work, we present a structured design space to guide the design of visual piling interfaces. To enable designers to efficiently build their own visual piling interfaces, PILING.JS provides a declarative interface to avoid having to write low-level code and implements common aspects of the design space. An accompanying GUI additionally supports the dynamic configuration of the piling interface. We demonstrate the expressiveness of PILING.JS with examples from machine learning, immunofluorescence microscopy, genomics, and public health.

2021-01-31

Predictive modeling of single-cell DNA methylome data enhances integration with transcriptome data

Uzun Y, Wu H, Tan K

TMC-CHOP

Single-cell DNA methylation data have become increasingly abundant and have uncovered many genes with a positive correlation between expression and promoter methylation, challenging the common dogma based on bulk data. However, computational tools for analyzing single-cell methylome data are lagging far behind. A number of tasks, including cell type calling and integration with transcriptome data, require the construction of a robust gene activity matrix as a prerequisite, which is itself a challenging task. The advent of multi-omics data enables measurement of both DNA methylation and gene expression for the same single cells. Although such data are rather sparse, they are sufficient to train supervised models that capture the complex relationship between DNA methylation and gene expression and predict gene activities at the single-cell level. Here, we present methylome association by predictive linkage to expression (MAPLE), a computational framework that learns the association between DNA methylation and expression using both gene- and cell-dependent statistical features. Using multiple data sets generated with different experimental protocols, we show that using predicted gene activity values significantly improves several analysis tasks, including clustering, cell type identification, and integration with transcriptome data. Application of MAPLE revealed several interesting biological insights into the relationship between methylation and gene expression, including asymmetric importance of methylation signals around the transcription start site for predicting gene expression, and increased predictive power of methylation signals in promoters located outside CpG islands and shores. With the rapid accumulation of single-cell epigenomics data, MAPLE provides a general framework for integrating such data with transcriptome data.

2021-02-15

Construction of a Multi-Phase Contrast Computed Tomography Kidney Atlas

Lee HH, Tang Y, Xu K, Bao S, Fogo AB, Harris R, de Caestecker MP, Heinrich M, Spraggins JM, Huo Y, Landman BA

TMC-Vanderbilt (Kidney)

The Human BioMolecular Atlas Program (HuBMAP) seeks to create a molecular atlas at the cellular level of the human body to spur interdisciplinary innovations across spatial and temporal scales. While the preponderance of effort is allocated towards cellular and molecular scale mapping, differentiating and contextualizing findings within tissues, organs and systems are essential for the HuBMAP efforts. The kidney is an initial organ target of HuBMAP, and constructing a framework (or atlas) for integrating information across scales is needed for visualizing and integrating information. However, there is no abdominal atlas currently available in the public domain. Substantial variation in healthy kidneys exists with sex, body size, and imaging protocols. With the integration of clinical archives for secondary research use, we are able to build atlases based on a diverse population and clinically relevant protocols. In this study, we created a computed tomography (CT) phase-specific atlas for the abdomen, which is optimized for the kidney organ. A two-stage registration pipeline was used by registering extracted abdominal volume of interest from body part regression, to a high-resolution CT. Affine and non-rigid registration were performed to all scans hierarchically. To generate and evaluate the atlas, multiphase CT scans of 500 control subjects (age: 15 - 50, 250 males, 250 females) are registered to the atlas target through the complete pipeline. The abdominal body and kidney registration are shown to be stable with the variance map computed from the result average template. Both left and right kidneys are substantially localized in the high-resolution target space, which successfully demonstrated the sharp details of its anatomical characteristics across each phase. We illustrated the applicability of the atlas template for integrating across normal kidney variation from 64 cm³ to 302 cm³.

2021-02-15

Renal Cortex, Medulla and Pelvicaliceal System Segmentation on Arterial Phase CT Images with Random Patch-based Networks

Tang Y, Gao R, Lee HH, Xu Z, Savoie BV, Bao S, Huo Y, Fogo AB, Harris R, de Caestecker MP, Spraggins J, Landman BA

TMC-Vanderbilt (Kidney)

Renal segmentation on contrast-enhanced computed tomography (CT) provides distinct spatial context and morphology. Current studies for renal segmentations are highly dependent on manual efforts, which are time-consuming and tedious. Hence, developing an automatic framework for the segmentation of renal cortex, medulla and pelvicalyceal system is important for quantitative assessment of renal morphometry. Recent innovations in deep learning methods have driven performance toward levels for which clinical translation is appealing. However, the segmentation of renal structures can be challenging due to the limited field-of-view (FOV) and variability among patients. In this paper, we propose a method to automatically label the renal cortex, the medulla and pelvicalyceal system. First, we retrieved 45 clinically-acquired deidentified arterial phase CT scans (45 patients, 90 kidneys) without diagnosis codes (ICD-9) involving kidney abnormalities. Second, an interpreter performed manual segmentation of the pelvis, medulla and cortex slice-by-slice on all retrieved subjects under expert supervision. Finally, we propose a patch-based deep neural network to automatically segment renal structures. Compared to the automatic baseline algorithm (3D U-Net) and conventional hierarchical method (3D U-Net Hierarchy), our proposed method achieves a mean Dice score of 0.7968 across three classes, an improvement over 0.6749 (3D U-Net) and 0.7482 (3D U-Net Hierarchy) (p-value < 0.001, paired t-tests between our method and 3D U-Net Hierarchy). In summary, the proposed algorithm provides a precise and efficient method for labeling renal structures.

2021-02-23

Spatial Segmentation of Mass Spectrometry Imaging Data by Combining Multivariate Clustering and Univariate Thresholding

Hu H, Yin R, Brown HM, Laskin J

TTD-Purdue

Spatial segmentation partitions mass spectrometry imaging (MSI) data into distinct regions, providing a concise visualization of the vast amount of data and identifying regions of interest (ROIs) for downstream statistical analysis. Unsupervised approaches are particularly attractive, as they may be used to discover the underlying subpopulations present in the high-dimensional MSI data without prior knowledge of the properties of the sample. Herein, we introduce an unsupervised spatial segmentation approach, which combines multivariate clustering and univariate thresholding to generate comprehensive spatial segmentation maps of the MSI data. This approach combines matrix factorization and manifold learning to enable high-quality image segmentation without an extensive hyperparameter search. In parallel, some ion images inadequately represented in the multivariate analysis were treated using univariate thresholding to generate complementary spatial segments. The final spatial segmentation map was assembled from segment candidates that were generated using both techniques. We demonstrate the performance and robustness of this approach for two MSI data sets of mouse uterine and kidney tissue sections that were acquired with different spatial resolutions. The resulting segmentation maps are easy to interpret and project onto the known anatomical regions of the tissue.

2021-03-01

Surfactant-assisted one-pot sample preparation for label-free single-cell proteomics

Tsai CF, Zhang P, Scholten D, et al.

TTD-PNNL/Northwestern

Large numbers of cells are generally required for quantitative global proteome profiling due to surface adsorption losses associated with sample processing. Such bulk measurement obscures important cell-to-cell variability (cell heterogeneity) and makes proteomic profiling impossible for rare cell populations (e.g., circulating tumor cells (CTCs)). Here we report a surfactant-assisted one-pot sample preparation coupled with mass spectrometry (MS) method termed SOP-MS for label-free global single-cell proteomics. SOP-MS capitalizes on the combination of a MS-compatible nonionic surfactant, n-Dodecyl-β-D-maltoside, and hydrophobic surface-based low-bind tubes or multi-well plates for ‘all-in-one’ one-pot sample preparation. This ‘all-in-one’ method including elimination of all sample transfer steps maximally reduces surface adsorption losses for effective processing of single cells, thus improving detection sensitivity for single-cell proteomics. This method allows convenient label-free quantification of hundreds of proteins from single human cells and ~1200 proteins from small tissue sections (close to ~20 cells). When applied to a patient CTC-derived xenograft (PCDX) model at the single-cell resolution, SOP-MS can reveal distinct protein signatures between primary tumor cells and early metastatic lung cells, which are related to the selection pressure of anti-tumor immunity during breast cancer metastasis. The approach paves the way for routine, precise, quantitative single-cell proteomics.

2021-03-08

Giotto: a toolbox for integrative analysis and visualization of spatial expression data

Dries R, Zhu Q, Dong R, Eng CL, Li H, Liu K, Fu Y, Zhao T, Sarkar A, Bao F, George RE, Pierson N, Cai L, Yuan GC

TTD-Cal Tech

Spatial transcriptomic and proteomic technologies have provided new opportunities to investigate cells in their native microenvironment. Here we present Giotto, a comprehensive and open-source toolbox for spatial data analysis and visualization. The analysis module provides end-to-end analysis by implementing a wide range of algorithms for characterizing tissue composition, spatial expression patterns, and cellular interactions. Furthermore, single-cell RNAseq data can be integrated for spatial cell-type enrichment analysis. The visualization module allows users to interactively visualize analysis outputs and imaging features. To demonstrate its general applicability, we apply Giotto to a wide range of datasets encompassing diverse technologies and platforms.

2021-03-16

Phase identification for dynamic CT enhancements with generative adversarial network

Tang Y, Gao R, Lee HH, Chen Y, Gao D, Bermudez C, Bao S, Huo Y, Savoie BV, Landman BA

TMC-Vanderbilt (Kidney)

Purpose: Dynamic contrast-enhanced computed tomography (CT) is widely used to provide dynamic tissue contrast for diagnostic investigation and vascular identification. However, the phase information of contrast injection is typically recorded manually by technicians, which leads to missing or mislabeled phase information. Hence, imaging-based contrast phase identification is appealing, but challenging, due to large variations among different contrast protocols, vascular dynamics, and metabolism, especially for clinically acquired CT scans. The purpose of this study is to perform imaging-based phase identification for dynamic abdominal CT using a proposed adversarial learning framework across five representative contrast phases. Methods: A generative adversarial network (GAN) is proposed as a disentangled representation learning model. To explicitly model different contrast phases, a low dimensional common representation and a class specific code are fused in the hidden layer. Then, the low dimensional features are reconstructed following a discriminator and classifier. 36,350 slices of CT scans from 400 subjects are used to evaluate the proposed method with fivefold cross-validation with splits on subjects. Then, 2,216 image slices from 20 independent subjects are employed as independent testing data, which are evaluated using a multiclass normalized confusion matrix. Results: The proposed network significantly improved correspondence (0.93) over VGG, ResNet50, StarGAN, and 3DSE with accuracy scores 0.59, 0.62, 0.72, and 0.90, respectively (P < 0.001 Stuart-Maxwell test for normalized multiclass confusion matrix). Conclusion: We show that adversarial learning for the discriminator can be beneficial for capturing contrast information among phases. The proposed discriminator from the disentangled network achieves promising results.

2021-03-22

Islet sympathetic innervation and islet neuropathology in patients with type 1 diabetes

Campbell-Thompson M, Butterworth EA, Boatwright JL, Nair MA, Nasif LH, Nasif K, Revell AY, Riva A, Mathews CE, Gerling IC, Schatz DA, Atkinson MA

TMC-PNNL

Dysregulation of glucagon secretion in type 1 diabetes (T1D) involves hypersecretion during postprandial states, but insufficient secretion during hypoglycemia. The sympathetic nervous system regulates glucagon secretion. To investigate islet sympathetic innervation in T1D, sympathetic tyrosine hydroxylase (TH) axons were analyzed in control non-diabetic organ donors, non-diabetic islet autoantibody-positive individuals (AAb), and age-matched persons with T1D. Islet TH axon numbers and density were significantly decreased in AAb compared to T1D with no significant differences observed in exocrine TH axon volume or lengths between groups. TH axons were in close approximation to islet α-cells in T1D individuals with long-standing diabetes. Islet RNA-sequencing and qRT-PCR analyses identified significant alterations in noradrenalin degradation, α-adrenergic signaling, cardiac β-adrenergic signaling, catecholamine biosynthesis, and additional neuropathology pathways. The close approximation of TH axons at islet α-cells supports a model for sympathetic efferent neurons directly regulating glucagon secretion. Sympathetic islet innervation and intrinsic adrenergic signaling pathways could be novel targets for improving glucagon secretion in T1D.

2021-04-14

CytoTalk: De novo construction of signal transduction networks using single-cell transcriptomic data

Hu Y, Peng T, Gao L, Tan K

TMC-CHOP

Single-cell technology enables study of signal transduction in a complex tissue at unprecedented resolution. We describe CytoTalk for de novo construction of cell type–specific signaling networks using single-cell transcriptomic data. Using an integrated intracellular and intercellular gene network as the input, CytoTalk identifies candidate pathways using the prize-collecting Steiner forest algorithm. Using high-throughput spatial transcriptomic data and single-cell RNA sequencing data with receptor gene perturbation, we demonstrate that CytoTalk has substantial improvement over existing algorithms. To better understand plasticity of signaling networks across tissues and developmental stages, we perform a comparative analysis of signaling networks between macrophages and endothelial cells across human adult and fetal tissues. Our analysis reveals an overall increased plasticity of signaling networks across adult tissues and specific network nodes that contribute to increased plasticity. CytoTalk enables de novo construction of signal transduction pathways and facilitates comparative analysis of these pathways across tissues and conditions.

2021-04-20

Quantitative Mass Spectrometry Imaging of Biological Systems

Unsihuay D, Mesa Sanchez D, Laskin J

TTD-Purdue

Mass spectrometry imaging (MSI) is a powerful, label-free technique that provides detailed maps of hundreds of molecules in complex samples with high sensitivity and subcellular spatial resolution. Accurate quantification in MSI relies on a detailed understanding of matrix effects associated with the ionization process along with evaluation of the extraction efficiency and mass-dependent ion losses occurring in the analysis step. We present a critical summary of approaches developed for quantitative MSI of metabolites, lipids, and proteins in biological tissues and discuss their current and future applications.

2021-04-26

Pancreas Optical Clearing and 3-D Microscopy in Health and Diabetes

Campbell-Thompson M, Tang SC

TMC-PNNL

Although first described over a hundred years ago, tissue optical clearing is undergoing renewed interest due to numerous advances in optical clearing methods, microscopy systems, and three-dimensional (3-D) image analysis programs. These advances are advantageous for intact mouse tissues or pieces of human tissues because samples sized several millimeters can be studied. Optical clearing methods are particularly useful for studies of the neuroanatomy of the central and peripheral nervous systems and tissue vasculature or lymphatic system. Using examples from solvent- and aqueous-based optical clearing methods, the mouse and human pancreatic structures and networks will be reviewed in 3-D for neuro-insular complexes, parasympathetic ganglia, and adipocyte infiltration as well as lymphatics in diabetes. Optical clearing with multiplex immunofluorescence microscopy provides new opportunities to examine the role of the nervous and circulatory systems in pancreatic and islet functions by defining their neurovascular anatomy in health and diabetes.

2021-04-27

Deeper Protein Identification Using Field Asymmetric Ion Mobility Spectrometry in Top-Down Proteomics

Gerbasi VR, Melani RD, Abbatiello SE, Belford MW, Huguet R, McGee JP, Dayhoff D, Thomas PM, Kelleher NL

RTI-Northwestern

Field asymmetric ion mobility spectrometry (FAIMS), when used in proteomics studies, provides superior selectivity and enables more proteins to be identified by providing additional gas-phase separation. Here, we tested the performance of cylindrical FAIMS for the identification and characterization of proteoforms by top-down mass spectrometry of heterogeneous protein mixtures. Combining FAIMS with chromatographic separation resulted in a 62% increase in protein identifications, an 8% increase in proteoform identifications, and an improvement in proteoform identification compared to samples analyzed without FAIMS. In addition, utilization of FAIMS resulted in the identification of proteins encoded by lower-abundance mRNA transcripts. These improvements were attributable, in part, to improved signal-to-noise for proteoforms with similar retention times. Additionally, our results show that the optimal compensation voltage of any given proteoform was correlated with the molecular weight of the analyte. Collectively these results suggest that the addition of FAIMS can enhance top-down proteomics in both discovery and targeted applications.

2021-05-01

Highly multiplexed tissue imaging using repeated oligonucleotide exchange reaction

Kennedy-Darling J, Bhate SS, Hickey JW, Black S, Barlow GL, Vazquez G, Venkataraaman VG, Samusik N, Goltsev Y, Schürch CM, Nolan GP

TMC-Stanford

Multiparameter tissue imaging enables analysis of cell-cell interactions in situ, the cellular basis for tissue structure, and novel cell types that are spatially restricted, giving clues to biological mechanisms behind tissue homeostasis and disease. Here, we streamlined and simplified the multiplexed imaging method CO-Detection by indEXing (CODEX) by validating 58 unique oligonucleotide barcodes that can be conjugated to antibodies. We showed that barcoded antibodies retained their specificity for staining cognate targets in human tissue. Antibodies were visualized one at a time by adding a fluorescently labeled oligonucleotide complementary to the oligonucleotide barcode, imaging, stripping, and repeating this cycle. With this we developed a panel of 46 antibodies that was used to stain five human lymphoid tissues: three tonsils, a spleen, and a lymph node. To analyze the data produced, an image processing and analysis pipeline was developed that enabled single-cell analysis on the data, including unsupervised clustering, that revealed 31 cell types across all tissues. We compared cell-type compositions within and directly surrounding follicles from the different lymphoid organs and evaluated cell-cell density correlations. This sequential oligonucleotide exchange technique enables facile imaging of tissues that leverages pre-existing imaging infrastructure to decrease the barriers to broad use of multiplexed imaging.

2021-05-03

Supervised Adversarial Alignment of Single-Cell RNA-seq Data

Ge S, Wang H, Alavi A, Xing E, Bar-Joseph Z

HIVE TC-CMU

Dimensionality reduction is an important first step in the analysis of single-cell RNA-sequencing (scRNA-seq) data. In addition to enabling the visualization of the profiled cells, such representations are used by many downstream analysis methods ranging from pseudo-time reconstruction to clustering to alignment of scRNA-seq data from different experiments, platforms, and laboratories. Both supervised and unsupervised methods have been proposed to reduce the dimension of scRNA-seq. However, all methods to date are sensitive to batch effects. When batches correlate with cell types, as is often the case, their impact can lead to representations that are batch rather than cell-type specific. To overcome this, we developed a domain adversarial neural network model for learning a reduced dimension representation of scRNA-seq data. The adversarial model tries to simultaneously optimize two objectives. The first is the accuracy of cell-type assignment and the second is the inability to distinguish the batch (domain). We tested the method by using the resulting representation to align several different data sets. As we show, by overcoming batch effects our method was able to correctly separate cell types, improving on several prior methods suggested for this task. Analysis of the top features used by the network indicates that by taking the batch impact into account, the reduced representation is much better able to focus on key genes for each cell type.

2021-05-10

SpatialDWLS: accurate deconvolution of spatial transcriptomic data

Dong R, Yuan GC

TTD-Cal Tech

Recent development of spatial transcriptomic technologies has made it possible to characterize cellular heterogeneity with spatial information. However, the technology often does not have sufficient resolution to distinguish neighboring cell types. Here, we present spatialDWLS, a method to quantitatively estimate the cell-type composition at each spatial location. We benchmark the performance of spatialDWLS by comparing it with a number of existing deconvolution methods and find that spatialDWLS outperforms the other methods in terms of accuracy and speed. By applying spatialDWLS to a human developmental heart dataset, we observe striking spatiotemporal changes of cell-type composition during development.

2021-05-11

Spatial multi-omics sequencing for fixed tissue via DBiT-seq

Su G, Qin X, Enninful A, Bai Z, Deng Y, Liu Y, Fan R

TTD-Yale

This protocol describes the use of the deterministic barcoding in tissue for spatial omics sequencing platform to construct a multi-omics atlas on fixed frozen tissue samples. This approach uses a microfluidic-based method to introduce combinatorial DNA oligo barcodes directly to the cells in a tissue section fixed on a glass slide. This technique does not directly resolve single cells but can achieve a near-single-cell resolution for spatial transcriptomics and spatial analysis of a targeted panel of proteins. For complete details on the use and execution of this protocol, please refer to Liu et al. (2020).

2021-05-17

Identifying signaling genes in spatial single-cell expression data

Li D, Ding J, Bar-Joseph Z

HIVE TC-CMU

Motivation: Recent technological advances enable the profiling of spatial single-cell expression data. Such data present a unique opportunity to study cell-cell interactions and the signaling genes that mediate them. However, most current methods for the analysis of these data focus on unsupervised descriptive modeling, making it hard to identify key signaling genes and quantitatively assess their impact. Results: We developed a Mixture of Experts for Spatial Signaling genes Identification (MESSI) method to identify active signaling genes within and between cells. The mixture of experts strategy enables MESSI to subdivide cells into subtypes. MESSI relies on multi-task learning using information from neighboring cells to improve the prediction of response genes within a cell. Applying the method to three spatial single-cell expression datasets, we show that MESSI accurately predicts the levels of response genes, improves upon prior methods, and provides useful biological insights about key signaling genes and subtypes of excitatory neuron cells. Availability and implementation: MESSI is available at: https://github.com/doraadong/MESSI. Supplementary information: Supplementary data are available at Bioinformatics online.

2021-05-19

Highly Multiplexed Phenotyping of Immunoregulatory Proteins in the Tumor Microenvironment by CODEX Tissue Imaging

Phillips D, Schürch CM, Khodadoust MS, Kim YH, Nolan GP, Jiang S

TMC-Stanford

Immunotherapies are revolutionizing cancer treatment by boosting the natural ability of the immune system. In addition to antibodies against traditional checkpoint molecules or their ligands (i.e., CTLA-4, PD-1, and PD-L1), therapies targeting molecules such as ICOS, IDO-1, LAG-3, OX40, TIM-3, and VISTA are currently in clinical trials. To better inform clinical care and the design of therapeutic combination strategies, the co-expression of immunoregulatory proteins on individual immune cells within the tumor microenvironment must be robustly characterized. Highly multiplexed tissue imaging platforms, such as CO-Detection by indEXing (CODEX), are primed to meet this need by enabling >50 markers to be simultaneously analyzed in single cells on formalin-fixed paraffin-embedded (FFPE) tissue sections. Assembly and validation of antibody panels is particularly challenging with respect to the specificity of antigen detection and robustness of signal over background. Herein, we report the design, development, optimization, and application of a 56-marker CODEX antibody panel to eight cutaneous T cell lymphoma (CTCL) patient samples. This panel comprises structural, tumor, and immune cell markers, including eight immunoregulatory proteins that are approved or currently undergoing clinical trials as immunotherapy targets. Here we provide a resource to enable extensive high-dimensional, spatially resolved characterization of the tissue microenvironment across tumor types and imaging modalities. This framework provides researchers with a readily applicable blueprint to study tumor immunology and tissue architecture and to enable mechanistic insights into immunotherapeutic targets.

2021-05-25

RAP-Net: Coarse-to-Fine Multi-Organ Segmentation with Single Random Anatomical Prior

Lee HH, Tang Y, Bao S, Abramson RG, Huo Y, Landman BA

TMC-Vanderbilt (Kidney)

Performing coarse-to-fine abdominal multi-organ segmentation facilitates extraction of high-resolution segmentation while minimizing the loss of spatial contextual information. However, current coarse-to-refine approaches require a significant number of models to perform single organ segmentation. We propose a coarse-to-fine pipeline, RAP-Net, which starts from the extraction of the global prior context of multiple organs from 3D volumes using a low-resolution coarse network, followed by a fine phase that uses a single refined model to segment all abdominal organs instead of multiple organ corresponding models. We combine the anatomical prior with corresponding extracted patches to preserve the anatomical locations and boundary information for performing high-resolution segmentation across all organs in a single model. To train and evaluate our method, a clinical research cohort consisting of 100 patient volumes with 13 well-annotated organs is used. We tested our algorithms with 4-fold cross-validation and computed the Dice score for evaluating the segmentation performance of the 13 organs. Our proposed method using single auto-context outperforms the state-of-the-art on 13 models with an average Dice score of 84.58% versus 81.69% (p<0.0001).
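
The Dice score used for evaluation in the abstract above can be illustrated with a minimal sketch. This is not the authors' evaluation code: the function names, the flat label sequences, and the convention of returning 1.0 when a label is absent from both prediction and ground truth are assumptions made for illustration.

```python
def dice_score(pred, truth, label):
    """Dice = 2|P ∩ T| / (|P| + |T|) for one organ label.

    pred and truth are equal-length sequences of integer labels
    (for example, a flattened segmentation volume).
    """
    p = [v == label for v in pred]
    t = [v == label for v in truth]
    intersection = sum(1 for a, b in zip(p, t) if a and b)
    denominator = sum(p) + sum(t)
    # Assumed convention: a label missing from both masks counts as a perfect match.
    return 1.0 if denominator == 0 else 2.0 * intersection / denominator


def mean_dice(pred, truth, labels):
    """Average the per-label Dice scores over the given organ labels."""
    return sum(dice_score(pred, truth, lab) for lab in labels) / len(labels)


# Toy example with background (0) and two hypothetical organ labels (1, 2).
pred = [0, 1, 1, 2, 2, 2, 0, 1]
truth = [0, 1, 1, 1, 2, 2, 0, 0]
print(dice_score(pred, truth, 1), dice_score(pred, truth, 2))
print(mean_dice(pred, truth, [1, 2]))
```

An average over 13 organ labels, as reported in the abstract, would correspond to calling `mean_dice` with those 13 labels.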

2021-05-26

Multiomics Imaging Using High-Energy Water Gas Cluster Ion Beam Secondary Ion Mass Spectrometry [(H₂O)_n-GCIB-SIMS] of Frozen-Hydrated Cells and Tissue

Tian H, Sheraz Née Rabbani S, Vickerman JC, Winograd N

TTD-Columbia/Penn State

Integration of multiomics at the single-cell level allows the unambiguous dissecting of phenotypic heterogeneity at different states such as health, disease, and biomedical response. Imaging mass spectrometry holds the promise of being able to measure multiple types of biomolecules in parallel in the same cell. We have explored the possibility of using water gas cluster ion beam secondary ion mass spectrometry [(H₂O)_n-GCIB-SIMS] as an analytical tool for multiomics assays. (H₂O)_n-GCIB has been hailed as an ideal ionization source for biological sampling owing to the enhanced chemical sensitivity and reduced matrix effect. Taking advantage of 1 μm spatial resolution by using a high-energy beam system, we have clearly shown the enhancement of multiple intact biomolecules up to a few hundredfold in single cells. Coupled with the cryogenic sample preparation/measurement, the lipids and metabolites were imaged simultaneously within the cellular region, uncovering the pristine chemistry for integrated omics in the same sample. We have demonstrated that double-charged myelin protein fragments and single-charged multiple lipids and metabolites can be localized in the same cells/tissue with a single acquisition. Our exploration has also been extended to the capability of (H₂O)_n-GCIB in the generation of multiple charged peptides on protein standards. Frozen hydration combined with (H₂O)_n-GCIB provides the possibility of universal enhancement for the ionization of multiple biomolecules, including peptides/proteins, which has allowed "omics" to become feasible in the same sample using SIMS.

2021-05-31

Body Part Regression With Self-Supervision

Tang Y, Gao R, Han S, Chen Y, Gao D, Nath V, Bermudez C, Savona MR, Bao S, Lyu I, Huo Y, Landman BA

TMC-Vanderbilt (Eye/pancreas)

Body part regression is a promising new technique that enables content navigation through self-supervised learning. Using this technique, the global quantitative spatial location for each axial view slice is obtained from computed tomography (CT). However, it is challenging to define a unified global coordinate system for body CT scans due to the large variabilities in image resolution, contrasts, sequences, and patient anatomy. Therefore, the widely used supervised learning approach cannot be easily deployed. To address these concerns, we propose an annotation-free method named blind-unsupervised-supervision network (BUSN). The contributions of the work are fourfold: (1) 1,030 multi-center CT scans are used in developing BUSN without any manual annotation; (2) the proposed BUSN corrects the predictions from unsupervised learning and uses the corrected results as the new supervision; (3) to improve the consistency of predictions, we propose a novel neighbor message passing (NMP) scheme that is integrated with BUSN as a statistical learning based correction; and (4) we introduce a new pre-processing pipeline with inclusion of the BUSN, which is validated on 3D multi-organ segmentation. The proposed method is trained on 1,030 whole body CT scans (230,650 slices) from five datasets, with an independent external validation cohort of 100 scans. From the body part regression results, the proposed BUSN achieved a significantly higher median R-squared score (=0.9089) than the state-of-the-art unsupervised method (=0.7153). When introducing BUSN as a preprocessing stage in volumetric segmentation, the proposed pre-processing pipeline using the BUSN approach increases the total mean Dice score of the 3D abdominal multi-organ segmentation from 0.7991 to 0.8145.
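
For readers unfamiliar with the R-squared score quoted above, the following is a minimal sketch of the standard coefficient of determination. How the authors pair predicted and reference slice positions is not stated in the abstract, so the inputs here (a list of reference positions and a list of predicted positions) and the function name are illustrative assumptions.

```python
def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


# Hypothetical reference and predicted positions for four axial slices.
reference = [0.0, 1.0, 2.0, 3.0]
predicted = [0.0, 1.0, 2.0, 2.0]
print(r_squared(reference, predicted))
```

A score of 1.0 means the predicted positions reproduce the reference positions exactly; lower values indicate a larger share of unexplained variance.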

2021-06-07

The emerging landscape of single-molecule protein sequencing technologies

Alfaro JA, Bohländer P, Dai M, Filius M, Howard CJ, van Kooten XF, Ohayon S, Pomorski A, Schmid S, Aksimentiev A, Anslyn EV, Bedran G, Cao C, Chinappi M, Coyaud E, Dekker C, Dittmar G, Drachman N, Eelkema R, Goodlett D, Hentz S, Kalathiya U, Kelleher NL, Kelly RT, Kelman Z, Kim SH, Kuster B, Rodriguez-Larrea D, Lindsay S, Maglia G, Marcotte EM, Marino JP, Masselon C, Mayer M, Samaras P, Sarthak K, Sepiashvili L, Stein D, Wanunu M, Wilhelm M, Yin P, Meller A, Joo C

RTI-Northwestern

Single-cell profiling methods have had a profound impact on the understanding of cellular heterogeneity. While genomes and transcriptomes can be explored at the single-cell level, single-cell profiling of proteomes is not yet established. Here we describe new single-molecule protein sequencing and identification technologies alongside innovations in mass spectrometry that will eventually enable broad sequence coverage in single-cell profiling. These technologies will in turn facilitate biological discovery and open new avenues for ultrasensitive disease diagnostics.

2021-06-15

Successive High-Resolution (H₂O)_n-GCIB and C₆₀-SIMS Imaging Integrates Multi-Omics in Different Cell Types in Breast Cancer Tissue

Tian H, Sparvero LJ, Anthonymuthu TS, Sun WY, Amoscato AA, He RR, Bayır H, Kagan VE, Winograd N

TTD-Columbia/Penn State

The temporo-spatial organization of different cells in the tumor microenvironment (TME) is the key to understanding their complex communication networks and the immune landscape that exists within compromised tissues. Multi-omics profiling of single-interacting cells in the native TME is critical for providing further information regarding the reprogramming mechanisms leading to immunosuppression and tumor progression. This requires new technologies for biomolecular profiling of phenotypically heterogeneous cells on the same tissue sample. Here, we developed a new methodology for comprehensive lipidomic and metabolomic profiling of individual cells on frozen-hydrated tissue sections using water gas cluster ion beam secondary ion mass spectrometry ((H₂O)_n-GCIB-SIMS) (at 1.6 μm beam spot size), followed by profiling cell-type specific lanthanide antibodies on the same tissue section using C₆₀-SIMS (at 1.1 μm beam spot size). We revealed distinct variations of distribution and intensities of >150 key ions (e.g., lipids and important metabolites) in different types of individual cells in the TME, such as actively proliferating tumor cells as well as infiltrating immune cells. The demonstrated feasibility of SIMS imaging to integrate the multi-omics profiling in the same tissue section at the single-cell level will lead to new insights into the role of lipid reprogramming and metabolic response in normal regulation or pathogenic discoordination of cell-cell interactions in a variety of tissue microenvironments.

2021-06-24

Integrated analysis of multimodal single-cell data

Hao Y, Hao S, Andersen-Nissen E, Mauck WM 3rd, Zheng S, Butler A, Lee MJ, Wilk AJ, Darby C, Zager M, Hoffman P, Stoeckius M, Papalexi E, Mimitou EP, Jain J, Srivastava A, Stuart T, Fleming LM, Yeung B, Rogers AJ, McElrath JM, Blish CA, Gottardo R, Smibert P, Satija R

HIVE MC-NYGC

The simultaneous measurement of multiple modalities represents an exciting frontier for single-cell genomics and necessitates computational methods that can define cellular states based on multimodal data. Here, we introduce "weighted-nearest neighbor" analysis, an unsupervised framework to learn the relative utility of each data type in each cell, enabling an integrative analysis of multiple modalities. We apply our procedure to a CITE-seq dataset of 211,000 human peripheral blood mononuclear cells (PBMCs) with panels extending to 228 antibodies to construct a multimodal reference atlas of the circulating immune system. Multimodal analysis substantially improves our ability to resolve cell states, allowing us to identify and validate previously unreported lymphoid subpopulations. Moreover, we demonstrate how to leverage this reference to rapidly map new datasets and to interpret immune responses to vaccination and coronavirus disease 2019 (COVID-19). Our approach represents a broadly applicable strategy to analyze single-cell multimodal datasets and to look beyond the transcriptome toward a unified and multimodal definition of cellular identity.

2021-07-02

Embryo-scale, single-cell spatial transcriptomics

Srivatsan SR, Regier MC, Barkan E, Franks JM, Packer JS, Grosjean P, Duran M, Saxton S, Ladd JJ, Spielmann M, Lois C, Lampe PD, Shendure J, Stevens KR, Trapnell C

TMC-Cal Tech

Spatial patterns of gene expression manifest at scales ranging from local (e.g., cell-cell interactions) to global (e.g., body axis patterning). However, current spatial transcriptomics methods either average local contexts or are restricted to limited fields of view. Here, we introduce sci-Space, which retains single-cell resolution while resolving spatial heterogeneity at larger scales. Applying sci-Space to developing mouse embryos, we captured approximate spatial coordinates and whole transcriptomes of about 120,000 nuclei. We identify thousands of genes exhibiting anatomically patterned expression, leverage spatial information to annotate cellular subtypes, show that cell types vary substantially in their extent of spatial patterning, and reveal correlations between pseudotime and the migratory patterns of differentiating neurons. Looking forward, we anticipate that sci-Space will facilitate the construction of spatially resolved single-cell atlases of mammalian development.

2021-07-06

Editorial: Global excellence in inflammatory diseases: North America 2021

Kusner LL, Misra RS, Lucas R

TMC-URMC

2021-07-07

New Interface for Faster Proteoform Analysis: Immunoprecipitation Coupled with SampleStream-Mass Spectrometry

Santos Seckler HD, Park HM, Lloyd-Jones CM, Melani RD, Camarillo JM, Wilkins JT, Compton PD, Kelleher NL

RTI-Northwestern

Different proteoform products of the same gene can exhibit differing associations with health and disease, and their patterns of modifications may offer more precise markers of phenotypic differences between individuals. However, currently employed protein-biomarker discovery and quantification tools, such as bottom-up proteomics and ELISAs, are mostly proteoform-unaware. Moreover, the current throughput for proteoform-level analyses by liquid chromatography mass spectrometry (LCMS) for quantitative top-down proteomics is incompatible with population-level biomarker surveys requiring robust, faster proteoform analysis. To this end, we developed immunoprecipitation coupled to SampleStream mass spectrometry (IP-SampleStream-MS) as a high-throughput, automated technique for the targeted quantification of proteoforms. We applied IP-SampleStream-MS to serum samples of 25 individuals to assess the proteoform abundances of apolipoproteins A-I (ApoA-I) and C-III (ApoC-III). The results for ApoA-I were compared to those of LCMS for these individuals, with IP-SampleStream-MS showing a >7-fold higher throughput with >50% better analytical variation. Proteoform abundances measured by IP-SampleStream-MS correlated strongly with LCMS-based values (R² = 0.6-0.9) and produced convergent proteoform-to-phenotype associations, namely, the abundance of canonical ApoA-I was associated with lower HDL-C (R = 0.5) and glycated ApoA-I with higher fasting glucose (R = 0.6). We also observed proteoform-to-phenotype associations for ApoC-III, 22 glycoproteoforms of which were characterized in this study. The abundance of ApoC-III modified by a single N-acetyl hexosamine (HexNAc) was associated with indices of obesity, such as BMI, weight, and waist circumference (R ∼ 0.7). These data show IP-SampleStream-MS to be a robust, scalable workflow for high-throughput associations of proteoforms to phenotypes.

2021-07-08

Mass spectrometry-based metabolomics: a guide for annotation, quantification and best reporting practices

Alseekh S, Aharoni A, Brotman Y, Contrepois K, D'Auria J, Ewald J, Ewald JC, Fraser PD, Giavalisco P, Hall RD, Heinemann M, Link H, Luo J, Neumann S, Nielsen J, Perez de Souza L, Saito K, Sauer U, Schroeder FC, Schuster S, Siuzdak G, Skirycz A, Sumner LW, Snyder MP, Tang H, Tohge T, Wang Y, Wen W, Wu S, Xu G, Zamboni N, Fernie AR

TMC-Stanford

Mass spectrometry-based metabolomics approaches can enable detection and quantification of many thousands of metabolite features simultaneously. However, compound identification and reliable quantification are greatly complicated owing to the chemical complexity and dynamic range of the metabolome. Simultaneous quantification of many metabolites within complex mixtures can additionally be complicated by ion suppression, fragmentation and the presence of isomers. Here we present guidelines covering sample preparation, replication and randomization, quantification, recovery and recombination, ion suppression and peak misidentification, as a means to enable high-quality reporting of liquid chromatography- and gas chromatography-mass spectrometry-based metabolomics-derived data.

2021-08-02

CODEX multiplexed tissue imaging with DNA-conjugated antibodies

Black S, Phillips D, Hickey JW, Kennedy-Darling J, Venkataraaman VG, Samusik N, Goltsev Y, Schürch CM, Nolan GP

TMC-Stanford

Advances in multiplexed imaging technologies have drastically improved our ability to characterize healthy and diseased tissues at the single-cell level. Co-detection by indexing (CODEX) relies on DNA-conjugated antibodies and the cyclic addition and removal of complementary fluorescently labeled DNA probes and has been used so far to simultaneously visualize up to 60 markers in situ. CODEX enables a deep view into the single-cell spatial relationships in tissues and is intended to spur discovery in developmental biology, disease and therapeutic design. Herein, we provide optimized protocols for conjugating purified antibodies to DNA oligonucleotides, validating the conjugation by CODEX staining and executing the CODEX multicycle imaging procedure for both formalin-fixed, paraffin-embedded (FFPE) and fresh-frozen tissues. In addition, we describe basic image processing and data analysis procedures. We apply this approach to an FFPE human tonsil multicycle experiment. The hands-on experimental time for antibody conjugation is ~4.5 h, validation of DNA-conjugated antibodies with CODEX staining takes ~6.5 h and preparation for a CODEX multicycle experiment takes ~8 h. The multicycle imaging and data analysis time depends on the tissue size, number of markers in the panel and computational complexity.

2021-08-05

Community-wide hackathons to identify central themes in single-cell multi-omics

Lê Cao KA, Abadi AJ, Davis-Marcisak EF, Hsu L, Arora A, Coullomb A, Deshpande A, Feng Y, Jeganathan P, Loth M, Meng C, Mu W, Pancaldi V, Sankaran K, Righelli D, Singh A, Sodicoff JS, Stein-O'Brien GL, Subramanian A, Welch JD, You Y, Argelaguet R, Carey VJ, Dries R, Greene CS, Holmes S, Love MI, Ritchie ME, Yuan GC, Culhane AC, Fertig E

TTD-Cal Tech

2021-08-10

Immunophenotyping assessment in a COVID-19 cohort (IMPACC): A prospective longitudinal study

IMPACC Manuscript Writing Team; IMPACC Network Steering Committee

TMC-Florida

The IMmunoPhenotyping Assessment in a COVID-19 Cohort (IMPACC) is a prospective longitudinal study designed to enroll 1000 hospitalized patients with COVID-19 (NCT04378777). IMPACC collects detailed clinical, laboratory and radiographic data along with longitudinal biologic sampling of blood and respiratory secretions for in depth testing. Clinical and lab data are integrated to identify immunologic, virologic, proteomic, metabolomic and genomic features of COVID-19-related susceptibility, severity and disease progression. The goals of IMPACC are to better understand the contributions of pathogen dynamics and host immune responses to the severity and course of COVID-19 and to generate hypotheses for identification of biomarkers and effective therapeutics, including optimal timing of such interventions. In this report we summarize the IMPACC study design and protocols including clinical criteria and recruitment, multi-site standardized sample collection and processing, virologic and immunologic assays, harmonization of assay protocols, high-level analyses and the data sharing plans.

2021-08-13

Strategies for Accurate Cell Type Identification in CODEX Multiplexed Imaging Data

Hickey JW, Tan Y, Nolan GP, Goltsev Y

TMC-Stanford

Multiplexed imaging is a recently developed and powerful single-cell biology research tool. However, it presents new sources of technical noise that are distinct from other types of single-cell data, necessitating new practices for single-cell multiplexed imaging processing and analysis, particularly regarding cell-type identification. Here we created single-cell multiplexed imaging datasets by performing CODEX on four sections of the human colon (ascending, transverse, descending, and sigmoid) using a panel of 47 oligonucleotide-barcoded antibodies. After cell segmentation, we implemented five different normalization techniques crossed with four unsupervised clustering algorithms, resulting in 20 unique cell-type annotations for the same dataset. We generated two standard annotations: hand-gated cell types and cell types produced by over-clustering with spatial verification. We then compared these annotations at four levels of cell-type granularity. First, increasing cell-type granularity led to decreased labeling accuracy; therefore, subtle phenotype annotations should be avoided at the clustering step. Second, accuracy in cell-type identification varied more with normalization choice than with clustering algorithm. Third, unsupervised clustering better accounted for segmentation noise during cell-type annotation than hand-gating. Fourth, Z-score normalization was generally effective in mitigating the effects of noise from single-cell multiplexed imaging. Variation in cell-type identification will lead to significant differential spatial results such as cellular neighborhood analysis; consequently, we also make recommendations for accurately assigning cell-type labels to CODEX multiplexed imaging.
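
As a rough illustration of the Z-score normalization mentioned in the abstract above, the sketch below standardizes each marker across cells. It assumes a cells-by-markers table of intensities and normalization per marker (column); the abstract does not specify the axis or any preprocessing, so these choices and the function name are assumptions rather than the authors' pipeline.

```python
from statistics import mean, pstdev


def zscore_markers(cells):
    """Z-score each marker (column) across all cells (rows).

    cells is a list of rows, one per segmented cell, each holding
    one intensity value per marker.
    """
    columns = list(zip(*cells))
    means = [mean(col) for col in columns]
    stdevs = [pstdev(col) for col in columns]
    return [
        # Assumed convention: a marker with zero variance maps to 0.0.
        [(x - m) / s if s > 0 else 0.0 for x, m, s in zip(row, means, stdevs)]
        for row in cells
    ]


# Three hypothetical cells measured on two hypothetical markers.
intensities = [[1.0, 10.0], [2.0, 10.0], [3.0, 10.0]]
normalized = zscore_markers(intensities)
print(normalized)
```

After normalization each varying marker has zero mean and unit variance across cells, which puts markers with very different raw intensity ranges on a comparable scale before clustering.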

2021-08-27

α-Cyano-4-hydroxycinnamic Acid and Tri-Potassium Citrate Salt Pre-Coated Silicon Nanopost Array Provides Enhanced Lipid Detection for High Spatial Resolution MALDI Imaging Mass Spectrometry

Dufresne M, Fincher JA, Patterson NH, Schey KL, Norris JL, Caprioli RM, Spraggins JM

TMC-Vanderbilt (Eye/pancreas)

We have developed a pre-coated substrate for matrix-assisted laser desorption/ionization (MALDI) imaging mass spectrometry (IMS) that enables high spatial resolution mapping of both phospholipids and neutral lipid classes in positive ion mode as metal cation adducts. The MALDI substrates are constructed by depositing a layer of α-cyano-4-hydroxycinnamic acid (CHCA) and potassium salts onto silicon nanopost arrays (NAPA) prior to tissue mounting. The matrix/salt pre-coated NAPA substrate significantly enhances all detected lipid signals allowing lipids to be detected at lower laser energies than bare NAPA. The improved sensitivity at lower laser energy enabled ion images to be generated at 10 μm spatial resolution from rat retinal tissue. Optimization of matrix pre-coated NAPA consisted of testing lithium, sodium, and potassium salts along with various matrices to investigate the increased sensitivity toward lipids for MALDI IMS experiments. It was determined that pre-coating NAPA with CHCA and potassium salts before thaw-mounting of tissue resulted in a signal intensity increase of at least 5.8 ± 0.1-fold for phospholipids and 2.0 ± 0.1-fold for neutral lipids compared to bare NAPA. Pre-coating NAPA with matrix and salt also reduced the necessary laser power to achieve desorption/ionization by ∼35%. This reduced the effective diameter of the ablation area from 13 ± 2 μm down to 8 ± 1 μm, enabling high spatial resolution MALDI IMS. Using pre-coated NAPA with CHCA and potassium salts offers a MALDI IMS substrate with broad molecular coverage of lipids in a single polarity that eliminates the need for extensive sample preparation after sectioning.

2021-09-02

Deep learning of gene relationships from single cell time-course expression data

Yuan Y, Bar-Joseph Z

HIVE TC-CMU

Time-course gene-expression data have been widely used to infer regulatory and signaling relationships between genes. Most of the widely used methods for such analysis were developed for bulk expression data. Single cell RNA-Seq (scRNA-Seq) data offer several advantages including the large number of expression profiles available and the ability to focus on individual cells rather than averages. However, the data also raise new computational challenges. Using a novel encoding for scRNA-Seq expression data, we develop deep learning methods for interaction prediction from time-course data. Our methods use a supervised framework which represents the data as a 3D tensor and trains convolutional and recurrent neural networks for predicting interactions. We tested our time-course deep learning (TDL) models on five different time-series scRNA-Seq datasets. As we show, TDL can accurately identify causal and regulatory gene-gene interactions and can also be used to assign new function to genes. TDL improves on prior methods for the above tasks and can be generally applied to new time-series scRNA-Seq data.

2021-09-02

Spatially Resolved Proteomic Analysis of the Lens Extracellular Diffusion Barrier

Wang Z, Cantrell LS, Schey KL

TMC-Vanderbilt (Eye/pancreas)

Purpose: The presence of a physical barrier to molecular diffusion through lenticular extracellular space has been repeatedly detected. This extracellular diffusion barrier has been proposed to restrict the movement of solutes into the lens and to direct nutrients into the lens core via the sutures at both poles. The purpose of this study is to characterize the molecular components that could contribute to the formation of this barrier. Methods: Three distinct regions in the bovine lens cortex were captured by laser capture microdissection guided by dye penetration. Proteins were digested by Lys C and trypsin. Mass spectrometry-based proteomic analysis followed by gene ontology and protein interaction network analysis was performed. Results: Dye penetration showed that fiber cells first shrink the extracellular spaces of the broad sides followed by closure of the extracellular space between narrow sides at a normalized lens distance (r/a) of 0.9. Accompanying the closure of extracellular space of the broad sides, dramatic proteomic changes were detected, including upregulation of several cell junctional proteins. AQP0 and its interacting partners, Ezrin and Radixin, were among a few proteins that were upregulated, accompanying the closure of extracellular space of the narrow sides, suggesting a particularly important role for AQP0 in controlling the narrowing of the extracellular spaces between fiber cells. The results also provided important information related to biological processes that occur during fiber cell differentiation such as organelle degradation, cytoskeletal remodeling, and glutathione synthesis. Conclusions: The formation of a lens extracellular diffusion barrier is accompanied by significant membrane and cytoskeletal protein remodeling.

2021-09-03

Facile One-Pot Nanoproteomics for Label-Free Proteome Profiling of 50-1000 Mammalian Cells

Martin K, Zhang T, Lin TT, Habowski AN, Zhao R, Tsai CF, Chrisler WB, Sontag RL, Orton DJ, Lu YJ, Rodland KD, Yang B, Liu T, Smith RD, Qian WJ, Waterman ML, Wiley HS, Shi T

TTD-PNNL/Northwestern

Recent advances in sample preparation enable label-free mass spectrometry (MS)-based proteome profiling of small numbers of mammalian cells. However, specific devices are often required to downscale sample processing volume from the standard 50-200 μL to sub-μL for effective nanoproteomics, which greatly impedes the implementation of current nanoproteomics methods by the proteomics research community. Herein, we report a facile one-pot nanoproteomics method termed SOPs-MS (surfactant-assisted one-pot sample processing at the standard volume coupled with MS) for convenient robust proteome profiling of 50-1000 mammalian cells. Building upon our recent development of SOPs-MS for label-free single-cell proteomics at a low μL volume, we have systematically evaluated its processing volume at 10-200 μL using 100 human cells. The processing volume of 50 μL that is in the range of volume for standard proteomics sample preparation has been selected for easy sample handling with a benchtop micropipette. SOPs-MS allows for reliable label-free quantification of ∼1200-2700 protein groups from 50 to 1000 MCF10A cells. When applied to small subpopulations of mouse colon crypt cells, SOPs-MS has revealed protein signatures between distinct subpopulation cells with identification of ∼1500-2500 protein groups for each subpopulation. SOPs-MS may pave the way for routine deep proteome profiling of small numbers of cells and low-input samples.

2021-09-08

Automated biomarker candidate discovery in imaging mass spectrometry data through spatially localized Shapley additive explanations

Tideman LEM, Migas LG, Djambazova KV, Patterson NH, Caprioli RM, Spraggins JM, Van de Plas R

TMC-Vanderbilt (Eye/pancreas)

The search for molecular species that are differentially expressed between biological states is an important step towards discovering promising biomarker candidates. In imaging mass spectrometry (IMS), performing this search manually is often impractical due to the large size and high-dimensionality of IMS datasets. Instead, we propose an interpretable machine learning workflow that automatically identifies biomarker candidates by their mass-to-charge ratios, and that quantitatively estimates their relevance to recognizing a given biological class using Shapley additive explanations (SHAP). The task of biomarker candidate discovery is translated into a feature ranking problem: given a classification model that assigns pixels to different biological classes on the basis of their mass spectra, the molecular species that the model uses as features are ranked in descending order of relative predictive importance such that the top-ranking features have a higher likelihood of being useful biomarkers. Besides providing the user with an experiment-wide measure of a molecular species' biomarker potential, our workflow delivers spatially localized explanations of the classification model's decision-making process in the form of a novel representation called SHAP maps. SHAP maps deliver insight into the spatial specificity of biomarker candidates by highlighting in which regions of the tissue sample each feature provides discriminative information and in which regions it does not. SHAP maps also enable one to determine whether the relationship between a biomarker candidate and a biological state of interest is correlative or anticorrelative. 
Our automated approach to estimating a molecular species' potential for characterizing a user-provided biological class, combined with the untargeted and multiplexed nature of IMS, allows for the rapid screening of thousands of molecular species and the obtention of a broader biomarker candidate shortlist than would be possible through targeted manual assessment. Our biomarker candidate discovery workflow is demonstrated on mouse-pup and rat kidney case studies.

2021-09-30

Characteristics of p.Gln368Ter Myocilin Variant and Influence of Polygenic Risk on Glaucoma Penetrance in the UK Biobank

Zebardast N, Sekimitsu S, Wang J, Elze T, Gharahkhani P, Cole BS, Lin MM, Segrè AV, Wiggs JL

DP-Harvard

Purpose: MYOC (myocilin) mutations account for 3% to 5% of primary open-angle glaucoma (POAG) cases. We aimed to understand the true population-wide penetrance and characteristics of glaucoma among individuals with the most common MYOC variant (p.Gln368Ter) and the impact of a POAG polygenic risk score (PRS) in this population. Design: Cross-sectional population-based study. Participants: Individuals with the p.Gln368Ter variant among 77 959 UK Biobank participants with fundus photographs (FPs). Methods: A genome-wide POAG PRS was computed, and 2 masked graders reviewed FPs for disc-defined glaucoma (DDG). Main outcome measures: Penetrance of glaucoma. Results: Two hundred individuals carried the p.Gln368Ter heterozygous genotype, and 177 had gradable FPs. One hundred thirty-two showed no evidence of glaucoma, 45 (25.4%) had probable/definite glaucoma in at least 1 eye, and 19 (10.7%) had bilateral glaucoma. No differences were found in age, race/ethnicity, or gender among groups (P > 0.05). Of those with DDG, 31% self-reported or had International Classification of Diseases codes for glaucoma, whereas 69% were undiagnosed. Those with DDG had higher medication-adjusted cornea-corrected intraocular pressure (IOPcc) (P < 0.001) vs. those without glaucoma. This difference in IOPcc was larger in those with DDG with a prior glaucoma diagnosis versus those not diagnosed (P < 0.001). Most p.Gln368Ter carriers showed IOP in the normal range (≤21 mmHg), although this proportion was lower in those with DDG (P < 0.02) and those with prior glaucoma diagnosis (P < 0.03). Prevalence of DDG increased with each decile of POAG PRS. Individuals with DDG demonstrated significantly higher PRS compared with those without glaucoma (0.37 ± 0.97 vs. 0.01 ± 0.90; P = 0.03). Of those with DDG, individuals with a prior diagnosis of glaucoma had higher PRS compared with undiagnosed individuals (1.31 ± 0.64 vs. 0.00 ± 0.81; P < 0.001) and 27.5 times (95% confidence interval, 2.5-306.6) adjusted odds of being in the top decile of PRS for POAG. Conclusions: One in 4 individuals with the MYOC p.Gln368Ter mutation demonstrated evidence of glaucoma, a substantially higher penetrance than previously estimated, with 69% of cases undetected. A large portion of p.Gln368Ter carriers, including those with DDG, have IOP in the normal range, despite similar age. Polygenic risk score increases disease penetrance and severity, supporting the usefulness of PRS in risk stratification among MYOC p.Gln368Ter carriers.

2021-09-30

Acceleration of age-induced proteolysis in the guinea pig lens nucleus by in vivo exposure to hyperbaric oxygen: A mass spectrometry analysis

Giblin FJ, Anderson DMG, Han J, Rose KL, Wang Z, Schey KL

TMC-Vanderbilt (Eye/pancreas)

Hyperbaric oxygen (HBO) treatment of animals or ocular lenses in culture recapitulates many molecular changes observed in human age-related nuclear cataract. The guinea pig HBO model has been one of the best examples of such treatment leading to dose-dependent development of lens nuclear opacities. In this study, complementary mass spectrometry methods were employed to examine protein truncation after HBO treatment of aged guinea pigs. Quantitative liquid chromatography-mass spectrometry (LC-MS) analysis of the membrane fraction of guinea pig lenses showed statistically significant increases in aquaporin-0 (AQP0) C-terminal truncation, consistent with previous reports of accelerated loss of membrane and cytoskeletal proteins. In addition, imaging mass spectrometry (IMS) analysis spatially mapped the acceleration of age-related αA-crystallin truncation in the lens nucleus. The truncation sites in αA-crystallin closely match those observed in human lenses with age. Taken together, our results suggest that HBO accelerates the normal lens aging process and leads to nuclear cataract.

2021-10-04

Computational tools for analyzing single-cell data in pluripotent cell differentiation studies

Ding J, Alavi A, Ebrahimkhani MR, Bar-Joseph Z

HIVE TC-CMU

Single-cell technologies are revolutionizing the ability of researchers to infer the causes and results of biological processes. Although several studies of pluripotent cell differentiation have recently utilized single-cell sequencing data, other aspects related to the optimization of differentiation protocols, their validation, robustness, and usage are still not taking full advantage of single-cell technologies. In this review, we focus on computational approaches for the analysis of single-cell omics and imaging data and discuss their use to address many of the major challenges involved in the development, validation, and use of cells obtained from pluripotent cell differentiation.

2021-10-14

Editorial: Footprints of Immune Cells in the Type 1 Diabetic Pancreas

Brusko TM, Mallone R, Rodriguez-Calvo T

TMC-Florida

2021-10-27

3D virtual reality vs. 2D desktop registration user interface comparison

Bueckle A, Buehling K, Shih PC, Börner K

HIVE MC-IU

Working with organs and extracted tissue blocks is an essential task in many medical surgery and anatomy environments. In order to prepare specimens from human donors for further analysis, wet-bench workers must properly dissect human tissue and collect metadata for downstream analysis, including information about the spatial origin of tissue. The Registration User Interface (RUI) was developed to allow stakeholders in the Human Biomolecular Atlas Program (HuBMAP) to register tissue blocks, i.e., to record the size, position, and orientation of human tissue data with regard to reference organs. The RUI has been used by tissue mapping centers across the HuBMAP consortium to register a total of 45 kidney, spleen, and colon tissue blocks, with planned support for 17 organs in the near future. In this paper, we compare three setups for registering one 3D tissue block object to another 3D reference organ (target) object. The first setup is a 2D Desktop implementation featuring a traditional screen, mouse, and keyboard interface. The remaining setups are both virtual reality (VR) versions of the RUI: VR Tabletop, where users sit at a physical desk which is replicated in virtual space; and VR Standup, where users stand upright while performing their tasks. All three setups were implemented using the Unity game engine. We then ran a user study for these three setups involving 42 human subjects completing 14 increasingly difficult and then 30 identical tasks in sequence and reporting position accuracy, rotation accuracy, completion time, and satisfaction. All study materials were made available in support of future study replication, alongside videos documenting our setups. 
We found that while VR Tabletop and VR Standup users are about three times as fast and about a third more accurate in terms of rotation than 2D Desktop users (for the sequence of 30 identical tasks), there are no significant differences between the three setups for position accuracy when normalized by the height of the virtual kidney across setups. When extrapolating from the 2D Desktop setup with a 113-mm-tall kidney, the absolute performance values for the 2D Desktop version (22.6 seconds per task, 5.88 degrees rotation, and 1.32 mm position accuracy after 8.3 tasks in the series of 30 identical tasks) confirm that the 2D Desktop interface is well-suited for allowing users in HuBMAP to register tissue blocks at a speed and accuracy that meets the needs of experts performing tissue dissection. In addition, the 2D Desktop setup is cheaper, easier to learn, and more practical for wet-bench environments than the VR setups.

2021-10-28

Deep learning and alignment of spatially resolved single-cell transcriptomes with Tangram

Biancalani T, Scalia G, Buffoni L, Avasthi R, Lu Z, Sanger A, Tokcan N, Vanderburg CR, Segerstolpe Å, Zhang M, Avraham-Davidi I, Vickovic S, Nitzan M, Ma S, Subramanian A, Lipinski M, Buenrostro J, Brown NB, Fanelli D, Zhuang X, Macosko EZ, Regev A

HIVE MC-NYGC

Charting an organ's biological atlas requires us to spatially resolve the entire single-cell transcriptome, and to relate such cellular features to the anatomical scale. Single-cell and single-nucleus RNA-seq (sc/snRNA-seq) can profile cells comprehensively, but lose spatial information. Spatial transcriptomics allows for spatial measurements, but at lower resolution and with limited sensitivity. Targeted in situ technologies solve both issues, but are limited in gene throughput. To overcome these limitations we present Tangram, a method that aligns sc/snRNA-seq data to various forms of spatial data collected from the same region, including MERFISH, STARmap, smFISH, Spatial Transcriptomics (Visium) and histological images. Tangram can map any type of sc/snRNA-seq data, including multimodal data such as those from SHARE-seq, which we used to reveal spatial patterns of chromatin accessibility. We demonstrate Tangram on healthy mouse brain tissue, by reconstructing a genome-wide anatomically integrated spatial map at single-cell resolution of the visual and somatomotor areas.

2021-10-31

Advances in spatial transcriptomic data analysis

Dries R, Chen J, Del Rossi N, Khan MM, Sistig A, Yuan GC

TTD-Cal Tech

Spatial transcriptomics is a rapidly growing field that promises to comprehensively characterize tissue organization and architecture at the single-cell or subcellular resolution. Such information provides a solid foundation for mechanistic understanding of many biological processes in both health and disease that cannot be obtained by using traditional technologies. The development of computational methods plays important roles in extracting biological signals from raw data. Various approaches have been developed to overcome technology-specific limitations such as spatial resolution, gene coverage, sensitivity, and technical biases. Downstream analysis tools formulate spatial organization and cell–cell communications as quantifiable properties, and provide algorithms to derive such properties. Integrative pipelines further assemble multiple tools in one package, allowing biologists to conveniently analyze data from beginning to end. In this review, we summarize the state of the art of spatial transcriptomic data analysis methods and pipelines, and discuss how they operate on different technological platforms.

2021-11-01

In-depth triacylglycerol profiling using MS3 Q-Trap mass spectrometry

Cabruja M, Priotti J, Domizi P, Papsdorf K, Kroetz DL, Brunet A, Contrepois K, Snyder MP

TMC-Stanford

Total triacylglycerol (TAG) level is a key clinical marker of metabolic and cardiovascular diseases. However, the roles of individual TAGs have not been thoroughly explored, in part due to their extreme structural complexity. We present a targeted mass spectrometry-based method combining multiple reaction monitoring (MRM) and multiple stage mass spectrometry (MS³) for the comprehensive qualitative and semiquantitative profiling of TAGs. This method, referred to as TriP-MS3 (triacylglycerol profiling using MS³), screens for more than 6,700 TAG species in a fully automated fashion. TriP-MS3 demonstrated excellent reproducibility (median interday CV ∼ 0.15) and linearity (median R² = 0.978) and detected 285 individual TAG species in human plasma. The semiquantitative accuracy of the method was validated by comparison with a state-of-the-art reverse phase liquid chromatography (RPLC)-MS method (R² = 0.83), which is the most commonly used approach for TAG profiling. Finally, we demonstrate the utility and the versatility of the method by characterizing the effects of a fatty acid desaturase inhibitor on TAG profiles in vitro and by profiling TAGs in Caenorhabditis elegans.

2021-11-01

Single-cell chromatin state analysis with Signac

Stuart T, Srivastava A, Madad S, Lareau CA, Satija R

HIVE MC-NYGC

The recent development of experimental methods for measuring chromatin state at single-cell resolution has created a need for computational tools capable of analyzing these datasets. Here we developed Signac, a comprehensive toolkit for the analysis of single-cell chromatin data. Signac enables an end-to-end analysis of single-cell chromatin data, including peak calling, quantification, quality control, dimension reduction, clustering, integration with single-cell gene expression datasets, DNA motif analysis and interactive visualization. Through its seamless compatibility with the Seurat package, Signac facilitates the analysis of diverse multimodal single-cell chromatin data, including datasets that co-assay DNA accessibility with gene expression, protein abundance and mitochondrial genotype. We demonstrate scaling of the Signac framework to analyze datasets containing over 700,000 cells.

2021-11-01

Haplotype-aware variant calling with PEPPER-Margin-DeepVariant enables high accuracy in nanopore long-reads

Shafin K, Pesout T, Chang PC, Nattestad M, Kolesnikov A, Goel S, Baid G, Kolmogorov M, Eizenga JM, Miga KH, Carnevali P, Jain M, Carroll A, Paten B

HIVE TC-CMU

Long-read sequencing has the potential to transform variant detection by reaching currently difficult-to-map regions and routinely linking together adjacent variations to enable read-based phasing. Third-generation nanopore sequence data have demonstrated a long read length, but current interpretation methods for their novel pore-based signal have unique error profiles, making accurate analysis challenging. Here, we introduce a haplotype-aware variant calling pipeline, PEPPER-Margin-DeepVariant, that produces state-of-the-art variant calling results with nanopore data. We show that our nanopore-based method outperforms the short-read-based single-nucleotide-variant identification method at the whole-genome scale and produces high-quality single-nucleotide variants in segmental duplications and low-mappability regions where short-read-based genotyping fails. We show that our pipeline can provide highly contiguous phase blocks across the genome with nanopore reads, contiguously spanning between 85% and 92% of annotated genes across six samples. We also extend PEPPER-Margin-DeepVariant to PacBio HiFi data, providing an efficient solution with superior performance over the current WhatsHap-DeepVariant standard. Finally, we demonstrate de novo assembly polishing methods that use nanopore and PacBio HiFi reads to produce diploid assemblies with high accuracy (Q35+ nanopore-polished and Q40+ PacBio HiFi-polished).

2021-11-08

Cell type ontologies of the Human Cell Atlas

Osumi-Sutherland D, Xu C, Keays M, Levine AP, Kharchenko PV, Regev A, Lein E, Teichmann SA

HIVE TC-CMU

Massive single-cell profiling efforts have accelerated our discovery of the cellular composition of the human body while at the same time raising the need to formalize this new knowledge. Here, we discuss current efforts to harmonize and integrate different sources of annotations of cell types and states into a reference cell ontology. We illustrate with examples how a unified ontology can consolidate and advance our understanding of cell types across scientific communities and biological domains.

2021-11-08

Anatomical structures, cell types and biomarkers of the Human Reference Atlas

Börner K, Teichmann SA, Quardokus EM, Gee JC, Browne K, Osumi-Sutherland D, Herr BW 2nd, Bueckle A, Paul H, Haniffa M, Jardine L, Bernard A, Ding SL, Miller JA, Lin S, Halushka MK, Boppana A, Longacre TA, Hickey J, Lin Y, Valerius MT, He Y, Pryhuber G, Sun X, Jorgensen M, Radtke AJ, Wasserfall C, Ginty F, Ho J, Sunshine J, Beuschel RT, Brusko M, Lee S, Malhotra R, Jain S, Weber G

HIVE MC-IU

The Human Reference Atlas (HRA) aims to map all of the cells of the human body to advance biomedical research and clinical practice. This Perspective presents collaborative work by members of 16 international consortia on two essential and interlinked parts of the HRA: (1) three-dimensional representations of anatomy that are linked to (2) tables that name and interlink major anatomical structures, cell types, plus biomarkers (ASCT+B). We discuss four examples that demonstrate the practical utility of the HRA.

2021-11-11

Towards inferring nanopore sequencing ionic currents from nucleotide chemical structures

Ding H, Anastopoulos I, Bailey AD 4th, Stuart J, Paten B

HIVE TC-CMU

The characteristic ionic currents of nucleotide kmers are commonly used in analyzing nanopore sequencing readouts. We present a graph convolutional network-based deep learning framework for predicting kmer characteristic ionic currents from corresponding chemical structures. We show such a framework can generalize the chemical information of the 5-methyl group from thymine to cytosine by correctly predicting 5-methylcytosine-containing DNA 6mers, thus shedding light on the de novo detection of nucleotide modifications.

2021-11-18

Immune cell topography predicts response to PD-1 blockade in cutaneous T cell lymphoma

Phillips D, Matusiak M, Gutierrez BR, Bhate SS, Barlow GL, Jiang S, Demeter J, Smythe KS, Pierce RH, Fling SP, Ramchurren N, Cheever MA, Goltsev Y, West RB, Khodadoust MS, Kim YH, Schürch CM, Nolan GP

TMC-Stanford

Cutaneous T cell lymphomas (CTCL) are rare but aggressive cancers without effective treatments. While a subset of patients derive benefit from PD-1 blockade, there is a critically unmet need for predictive biomarkers of response. Herein, we perform CODEX multiplexed tissue imaging and RNA sequencing on 70 tumor regions from 14 advanced CTCL patients enrolled in a pembrolizumab clinical trial (NCT02243579). We find no differences in the frequencies of immune or tumor cells between responders and non-responders. Instead, we identify topographical differences between effector PD-1+ CD4+ T cells, tumor cells, and immunosuppressive Tregs, from which we derive a spatial biomarker, termed the SpatialScore, that correlates strongly with pembrolizumab response in CTCL. The SpatialScore coincides with differences in the functional immune state of the tumor microenvironment, T cell function, and tumor cell-specific chemokine recruitment and is validated using a simplified, clinically accessible tissue imaging platform. Collectively, these results provide a paradigm for investigating the spatial balance of effector and suppressive T cell activity and broadly leveraging this biomarker approach to inform the clinical use of immunotherapies.

2021-11-23

Scalable dual-omics profiling with single-nucleus chromatin accessibility and mRNA expression sequencing 2 (SNARE-seq2)

Plongthongkum N, Diep D, Chen S, Lake BB, Zhang K

TMC-UCSD

Comprehensive characterization of cellular heterogeneity and the underlying regulatory landscapes of tissues and organs requires a highly robust and scalable method to acquire matched RNA and chromatin accessibility profiles on the same cells. Here, we describe a single-nucleus chromatin accessibility and mRNA expression sequencing 2 (SNARE-seq2) assay, implemented with cellular combinatorial indexing. This method involves tagmentation within permeabilized and fixed single-nucleus isolates to capture accessible chromatin (AC) regions, followed by the capture and reverse transcription of RNA transcripts. Through combinatorial split pool ligations, cDNA and AC within each single nucleus become appended with a common cell barcode combination. The captured cDNA and AC are then co-amplified before splitting and enrichment into single-nucleus RNA and single-nucleus AC sequencing libraries. This protocol is compatible with both nuclei and whole cells and can be completed in 3.5 d. SNARE-seq2 permits robust generation of high-quality, joint single-cell RNA and AC sequencing libraries from hundreds of thousands of single cells per experiment.

2021-11-26

Self-supervised clustering of mass spectrometry imaging data using contrastive learning

Hu H, Bindu JP, Laskin J

TTD-Purdue

Mass spectrometry imaging (MSI) is widely used for the label-free molecular mapping of biological samples. The identification of co-localized molecules in MSI data is crucial to the understanding of biochemical pathways. One of the key challenges in molecular colocalization is that complex MSI data are too large for manual annotation but too small for training deep neural networks. Herein, we introduce a self-supervised clustering approach based on contrastive learning, which shows excellent performance in clustering of MSI data. We train a deep convolutional neural network (CNN) using MSI data from a single experiment without manual annotations to effectively learn high-level spatial features from ion images and classify them based on molecular colocalizations. We demonstrate that contrastive learning generates ion image representations that form well-resolved clusters. Subsequent self-labeling is used to fine-tune both the CNN encoder and linear classifier based on confidently classified ion images. This new approach enables autonomous and high-throughput identification of co-localized species in MSI data, which will dramatically expand the application of spatial lipidomics, metabolomics, and proteomics in biological research.

2021-12-01

A Pilot Study of Urine Proteomics in Covid-19-associated Acute Kidney Injury

Ye Y, Swensen AC, Wang Y, Kaushal M, Salamon D, Knoten A, Nicora CD, Marks L, Gaut JP, Vijayan A, Orton DJ, Mudd PA, Parikh CR, Qian WJ, O’Halloran JA, Piehowski PD, Jain S

TTD-Purdue

Acute kidney injury (AKI) is a major complication associated with COVID-19 and occurs in up to 76% of intensive care unit patients [1, 2]. The mortality rate of COVID-19 patients who developed AKI (COVID-AKI) is more than 10 times higher than that of those who did not. While candidate AKI markers exist, the etiology of COVID-AKI is multifactorial, requiring agnostic approaches for identification of analytes early in the hospital course to provide insights into biomarkers and mechanisms associated with COVID-AKI and COVID-19 infection. Research on COVID-19-associated effects on the urinary proteome is limited, and kidney dysfunction has not been reported [3]. Approximately 70% of the proteins detected in urine are produced in the kidney with a significant amount filtered from blood. We hypothesize that the changes in protein abundances in urine could lead to the discovery of protein markers associated with COVID-19 or COVID-AKI and provide mechanistic insights to improve understanding. Results and Discussion: We analyzed urine samples from 14 participants (6 COVID-AKI, 3 COVID-NoAKI and 5 NoCOVID-NoAKI) (Figure 1A, Table S1). To account for the large variation in protein content across urine specimens, we utilized a bicinchoninic acid (BCA) assay to measure peptide concentration after protein digestion. All peptide samples are then normalized to the same concentration prior to analysis to allow for relative quantitation of differences in the urinary proteome. The urine proteome of COVID-AKI: After confirming the quality of urine for analyte discovery (Table S2, Figure S1), we examined if underlying variance could distinguish AKI+ samples (6 COVID-AKI) from other samples without AKI (8 AKI-); all AKI samples were from COVID-19 patients. The first two principal components accounted for 42.1% of the variance, and clearly separated the two groups (Figure 1B); thereby

2021-12-06

geneBasis: an iterative approach for unsupervised selection of targeted gene panels from scRNA-seq

Missarova A, Jain J, Butler A, Ghazanfar S, Stuart T, Brusko M, Wasserfall C, Nick H, Brusko T, Atkinson M, Satija R, Marioni JC

HIVE MC-NYGC

scRNA-seq datasets are increasingly used to identify gene panels that can be probed using alternative technologies, such as spatial transcriptomics, where choosing the best subset of genes is vital. Existing methods are limited by a reliance on pre-existing cell type labels or by difficulties in identifying markers of rare cells. We introduce an iterative approach, geneBasis, for selecting an optimal gene panel, where each newly added gene captures the maximum distance between the true manifold and the manifold constructed using the currently selected gene panel. Our approach outperforms existing strategies and can resolve cell types and subtle cell state differences.

2021-12-14

Cross-Laboratory Standardization of Preclinical Lipidomics Using Differential Mobility Spectrometry and Multiple Reaction Monitoring

Ghorasaini M, Mohammed Y, Adamski J, Bettcher L, Bowden JA, Cabruja M, Contrepois K, Ellenberger M, Gajera B, Haid M, Hornburg D, Hunter C, Jones CM, Klein T, Mayboroda O, Mirzaian M, Moaddel R, Ferrucci L, Lovett J, Nazir K, Pearson M, Ubhi BK, Raftery D, Riols F, Sayers R, Sijbrands EJG, Snyder MP, Su B, Velagapudi V, Williams KJ, de Rijke YB, Giera M

TMC-Stanford

Modern biomarker and translational research as well as personalized health care studies rely heavily on powerful omics technologies, including metabolomics and lipidomics. However, to translate metabolomics and lipidomics discoveries into a high-throughput clinical setting, standardization is of utmost importance. Here, we compared and benchmarked a quantitative lipidomics platform. The employed Lipidyzer platform is based on lipid class separation by means of differential mobility spectrometry with subsequent multiple reaction monitoring. Quantitation is achieved by the use of 54 deuterated internal standards and an automated informatics approach. We investigated the platform performance across nine laboratories using NIST SRM 1950-Metabolites in Frozen Human Plasma, and three NIST Candidate Reference Materials 8231-Frozen Human Plasma Suite for Metabolomics (high triglyceride, diabetic, and African-American plasma). In addition, we comparatively analyzed 59 plasma samples from individuals with familial hypercholesterolemia from a clinical cohort study. We provide evidence that the more practical methyl-tert-butyl ether extraction outperforms the classic Bligh and Dyer approach and compare our results with two previously published ring trials. In summary, we present standardized lipidomics protocols, allowing for the highly reproducible analysis of several hundred human plasma lipids, and present detailed molecular information for potentially disease relevant and ethnicity-related materials.

2021-12-17

Pangenomics enables genotyping of known structural variants in 5202 diverse genomes

Sirén J, Monlong J, Chang X, Novak AM, Eizenga JM, Markello C, Sibbesen JA, Hickey G, Chang PC, Carroll A, Gupta N, Gabriel S, Blackwell TW, Ratan A, Taylor KD, Rich SS, Rotter JI, Haussler D, Garrison E, Paten B

HIVE TC-CMU

We introduce Giraffe, a pangenome short-read mapper that can efficiently map to a collection of haplotypes threaded through a sequence graph. Giraffe maps sequencing reads to thousands of human genomes at a speed comparable to that of standard methods mapping to a single reference genome. The increased mapping accuracy enables downstream improvements in genome-wide genotyping pipelines for both small variants and larger structural variants. We used Giraffe to genotype 167,000 structural variants, discovered in long-read studies, in 5202 diverse human genomes that were sequenced using short reads. We conclude that pangenomics facilitates a more comprehensive characterization of variation and, as a result, has the potential to improve many genomic analyses.

2021-12-22

Tissue fixation effects on human retinal lipid analysis by MALDI imaging and LC-MS/MS technologies

Kotnala A, Anderson DMG, Patterson NH, Cantrell LS, Messinger JD, Curcio CA, Schey KL

TMC-Vanderbilt (Eye/pancreas)

Imaging mass spectrometry (IMS) allows the location and abundance of lipids to be mapped across tissue sections of human retina. For reproducible and accurate information, sample preparation methods need to be optimized. Paraformaldehyde fixation of a delicate multilayer structure like human retina facilitates the preservation of tissue morphology by forming methylene bridge crosslinks between formaldehyde and amine/thiols in biomolecules; however, retina sections analyzed by IMS are typically fresh-frozen. To determine if clinically significant inferences could be reliably based on fixed tissue, we evaluated the effect of fixation on analyte detection, spatial localization, and introduction of artifactual signals. Hence, we assessed the molecular identity of lipids generated by matrix-assisted laser desorption ionization (MALDI-IMS) and liquid chromatography coupled tandem mass spectrometry (LC-MS/MS) for fixed and fresh-frozen retina tissues in positive and negative ion modes. Based on MALDI-IMS analysis, more lipid signals were observed in fixed compared with fresh-frozen retina. More potassium adducts were observed in fresh-frozen tissues than fixed as the fixation process caused displacement of potassium adducts to protonated and sodiated species in positive ion mode. LC-MS/MS analysis revealed an overall decrease in lipid signals due to fixation that reduced glycerophospholipids and glycerolipids and conserved most sphingolipids and cholesteryl esters. The high quality and reproducible information from untargeted lipidomics analysis of fixed retina informs on all major lipid classes, similar to fresh-frozen retina, and serves as a steppingstone towards understanding of lipid alterations in retinal diseases.

2022-01-01

Magnetic resonance linear accelerator technology and adaptive radiation therapy: An overview for clinicians

Hall WA, Paulson E, Li XA, Erickson B, Schultz C, Tree A, Awan M, Low DA, McDonald BA, Salzillo T, Glide-Hurst CK, Kishan AU, Fuller CD

HIVE IEC-PSC

Radiation therapy (RT) continues to play an important role in the treatment of cancer. Adaptive RT (ART) is a novel method through which RT treatments are evolving. With the ART approach, computed tomography or magnetic resonance (MR) images are obtained as part of the treatment delivery process. This enables the adaptation of the irradiated volume to account for changes in organ and/or tumor position, movement, size, or shape that may occur over the course of treatment. The advantages and challenges of ART may be somewhat abstract to oncologists and clinicians outside of the specialty of radiation oncology. ART is positioned to affect many different types of cancer. There is a wide spectrum of hypothesized benefits, from small toxicity improvements to meaningful gains in overall survival. The use and application of this novel technology should be understood by the oncologic community at large, such that it can be appropriately contextualized within the landscape of cancer therapies. Likewise, the need to test these advances is pressing. MR-guided ART (MRgART) is an emerging, extended modality of ART that expands upon and further advances the capabilities of ART. MRgART presents unique opportunities to iteratively improve adaptive image guidance. However, although the MRgART adaptive process advances ART to previously unattained levels, it can be more expensive, time-consuming, and complex. In this review, the authors present an overview for clinicians describing the process of ART and specifically MRgART.

2022-01-03

Highly multiplexed immunofluorescence of the human kidney using co-detection by indexing

Neumann EK, Patterson NH, Rivera ES, Allen JL, Brewer M, deCaestecker MP, Caprioli RM, Fogo AB, Spraggins JM

TMC-Vanderbilt (Kidney)

The human kidney is composed of many cell types that vary in their abundance and distribution from normal to diseased organ. As these cell types perform unique and essential functions, it is important to confidently label each within a single tissue to accurately assess tissue architecture and microenvironments. Towards this goal, we demonstrate the use of co-detection by indexing (CODEX) multiplexed immunofluorescence for visualizing 23 antigens within the human kidney. Using CODEX, many of the major cell types and substructures, such as collecting ducts, glomeruli, and thick ascending limb, were visualized within a single tissue section. Of these antibodies, 19 were conjugated in-house, demonstrating the flexibility and utility of this approach for studying the human kidney using custom and commercially available antibodies. We performed a pilot study that compared both fresh frozen and formalin-fixed paraffin-embedded healthy non-neoplastic and diabetic nephropathy kidney tissues. The largest cellular differences between the two groups were observed in cells labeled with aquaporin 1, cytokeratin 7, and α-smooth muscle actin. Thus, our data show the power of CODEX multiplexed immunofluorescence for surveying the cellular diversity of the human kidney and the potential for applications within pathology, histology, and building anatomical atlases.

2022-01-10

Integration of spatial and single-cell transcriptomic data elucidates mouse organogenesis

Lohoff T, Ghazanfar S, Missarova A, Koulena N, Pierson N, Griffiths JA, Bardot ES, Eng CL, Tyser RCV, Argelaguet R, Guibentif C, Srinivas S, Briscoe J, Simons BD, Hadjantonakis AK, Göttgens B, Reik W, Nichols J, Cai L, Marioni JC

HIVE MC-NYGC

Molecular profiling of single cells has advanced our knowledge of the molecular basis of development. However, current approaches mostly rely on dissociating cells from tissues, thereby losing the crucial spatial context of regulatory processes. Here, we apply an image-based single-cell transcriptomics method, sequential fluorescence in situ hybridization (seqFISH), to detect mRNAs for 387 target genes in tissue sections of mouse embryos at the 8-12 somite stage. By integrating spatial context and multiplexed transcriptional measurements with two single-cell transcriptome atlases, we characterize cell types across the embryo and demonstrate that spatially resolved expression of genes not profiled by seqFISH can be imputed. We use this high-resolution spatial map to characterize fundamental steps in the patterning of the midbrain-hindbrain boundary (MHB) and the developing gut tube. We uncover axes of cell differentiation that are not apparent from single-cell RNA-sequencing (scRNA-seq) data, such as early dorsal-ventral separation of esophageal and tracheal progenitor populations in the gut tube. Our method provides an approach for studying cell fate decisions in complex tissues and development.

2022-01-10

A census of the lung: CellCards from LungMAP

Sun X, Perl AK, Li R, Bell SM, Sajti E, Kalinichenko VV, Kalin TV, Misra RS, Deshmukh H, Clair G, Kyle J, Crotty Alexander LE, Masso-Silva JA, Kitzmiller JA, Wikenheiser-Brokamp KA, Deutsch G, Guo M, Du Y, Morley MP, Valdez MJ, Yu HV, Jin K, Bardes EE, Zepp JA, Neithamer T, Basil MC, Zacharias WJ, Verheyden J, Young R, Bandyopadhyay G, Lin S, Ansong C, Adkins J, Salomonis N, Aronow BJ, Xu Y, Pryhuber G, Whitsett J, Morrisey EE, NHLBI LungMAP Consortium

TMC-UCSD

The human lung plays vital roles in respiration, host defense, and basic physiology. Recent technological advancements such as single-cell RNA sequencing and genetic lineage tracing have revealed novel cell types and enriched functional properties of existing cell types in lung. The time has come to take a new census. Initiated by members of the NHLBI-funded LungMAP Consortium and aided by experts in the lung biology community, we synthesized current data into a comprehensive and practical cellular census of the lung. Identities of cell types in the normal lung are captured in individual cell cards with delineation of function, markers, developmental lineages, heterogeneity, regenerative potential, disease links, and key experimental tools. This publication will serve as the starting point of a live, up-to-date guide for lung research at https://www.lungmap.net/cell-cards/. We hope that Lung CellCards will promote the community-wide effort to establish, maintain, and restore respiratory health.

2022-01-18

Comparison and evaluation of statistical error models for scRNA-seq

Choudhary S, Satija R

HIVE MC-NYGC

Background: Heterogeneity in single-cell RNA-seq (scRNA-seq) data is driven by multiple sources, including biological variation in cellular state as well as technical variation introduced during experimental processing. Deconvolving these effects is a key challenge for preprocessing workflows. Recent work has demonstrated the importance and utility of count models for scRNA-seq analysis, but there is a lack of consensus on which statistical distributions and parameter settings are appropriate. Results: Here, we analyze 59 scRNA-seq datasets that span a wide range of technologies, systems, and sequencing depths in order to evaluate the performance of different error models. We find that while a Poisson error model appears appropriate for sparse datasets, we observe clear evidence of overdispersion for genes with sufficient sequencing depth in all biological systems, necessitating the use of a negative binomial model. Moreover, we find that the degree of overdispersion varies widely across datasets, systems, and gene abundances, which argues for a data-driven approach for parameter estimation. Conclusions: Based on these analyses, we provide a set of recommendations for modeling variation in scRNA-seq data, particularly when using generalized linear models or likelihood-based approaches for preprocessing and downstream analysis.

2022-01-20

The immunoregulatory landscape of human tuberculosis granulomas

McCaffrey EF, Donato M, Keren L, Chen Z, Delmastro A, Fitzpatrick MB, Gupta S, Greenwald NF, Baranski A, Graf W, Kumar R, Bosse M, Fullaway CC, Ramdial PK, Forgó E, Jojic V, Van Valen D, Mehra S, Khader SA, Bendall SC, van de Rijn M, Kalman D, Kaushal D, Hunter RL, Banaei N, Steyn AJC, Khatri P, Angelo M

RTI-Stanford

Tuberculosis (TB) in humans is characterized by formation of immune-rich granulomas in infected tissues, the architecture and composition of which are thought to affect disease outcome. However, our understanding of the spatial relationships that control human granulomas is limited. Here, we used multiplexed ion beam imaging by time of flight (MIBI-TOF) to image 37 proteins in tissues from patients with active TB. We constructed a comprehensive atlas that maps 19 cell subsets across 8 spatial microenvironments. This atlas shows an IFN-γ-depleted microenvironment enriched for TGF-β, regulatory T cells and IDO1+ PD-L1+ myeloid cells. In a further transcriptomic meta-analysis of peripheral blood from patients with TB, immunoregulatory trends mirror those identified by granuloma imaging. Notably, PD-L1 expression is associated with progression to active TB and treatment response. These data indicate that in TB granulomas, there are local spatially coordinated immunoregulatory programs with systemic manifestations that define active TB.

2022-01-20

Transition to invasive breast cancer is associated with progressive changes in the structure and composition of tumor stroma

Risom T, Glass DR, Averbukh I, Liu CC, Baranski A, Kagel A, McCaffrey EF, Greenwald NF, Rivero-Gutiérrez B, Strand SH, Varma S, Kong A, Keren L, Srivastava S, Zhu C, Khair Z, Veis DJ, Deschryver K, Vennam S, Maley C, Hwang ES, Marks JR, Bendall SC, Colditz GA, West RB, Angelo M

RTI-Stanford

Ductal carcinoma in situ (DCIS) is a pre-invasive lesion that is thought to be a precursor to invasive breast cancer (IBC). To understand the changes in the tumor microenvironment (TME) accompanying transition to IBC, we used multiplexed ion beam imaging by time of flight (MIBI-TOF) and a 37-plex antibody staining panel to interrogate 79 clinically annotated surgical resections using machine learning tools for cell segmentation, pixel-based clustering, and object morphometrics. Comparison of normal breast with patient-matched DCIS and IBC revealed coordinated transitions between four TME states that were delineated based on the location and function of myoepithelium, fibroblasts, and immune cells. Surprisingly, myoepithelial disruption was more advanced in DCIS patients that did not develop IBC, suggesting this process could be protective against recurrence. Taken together, this HTAN Breast PreCancer Atlas study offers insight into drivers of IBC relapse and emphasizes the importance of the TME in regulating these processes.

2022-01-28

The Blood Proteoform Atlas: A reference map of proteoforms in human hematopoietic cells

Melani RD, Gerbasi VR, Anderson LC, Sikora JW, Toby TK, Hutton JE, Butcher DS, Negrão F, Seckler HS, Srzentić K, Fornelli L, Camarillo JM, LeDuc RD, Cesnik AJ, Lundberg E, Greer JB, Fellers RT, Robey MT, DeHart CJ, Forte E, Hendrickson CL, Abbatiello SE, Thomas PM, Kokaji AI, Levitsky J, Kelleher NL

RTI-Northwestern

Human biology is tightly linked to proteins, yet most measurements do not precisely determine alternatively spliced sequences or posttranslational modifications. Here, we present the primary structures of ~30,000 unique proteoforms, nearly 10 times more than in previous studies, expressed from 1690 human genes across 21 cell types and plasma from human blood and bone marrow. The results, compiled in the Blood Proteoform Atlas (BPA), indicate that proteoforms better describe protein-level biology and are more specific indicators of differentiation than their corresponding proteins, which are more broadly expressed across cell types. We demonstrate the potential for clinical application, by interrogating the BPA in the context of liver transplantation and identifying cell and proteoform signatures that distinguish normal graft function from acute rejection and other causes of graft dysfunction.

2022-01-31

Spatial genomics enables multi-modal study of clonal heterogeneity in tissues

Zhao T, Chiang ZD, Morriss JW, LaFave LM, Murray EM, Del Priore I, Meli K, Lareau CA, Nadaf NM, Li J, Earl AS, Macosko EZ, Jacks T, Buenrostro JD, Chen F

RTI-Broad

The state and behaviour of a cell can be influenced by both genetic and environmental factors. In particular, tumour progression is determined by underlying genetic aberrations as well as the makeup of the tumour microenvironment. Quantifying the contributions of these factors requires new technologies that can accurately measure the spatial location of genomic sequence together with phenotypic readouts. Here we developed slide-DNA-seq, a method for capturing spatially resolved DNA sequences from intact tissue sections. We demonstrate that this method accurately preserves local tumour architecture and enables the de novo discovery of distinct tumour clones and their copy number alterations. We then apply slide-DNA-seq to a mouse model of metastasis and a primary human cancer, revealing that clonal populations are confined to distinct spatial regions. Moreover, through integration with spatial transcriptomics, we uncover distinct sets of genes that are associated with clone-specific genetic aberrations, the local tumour microenvironment, or both. Together, this multi-modal spatial genomics approach provides a versatile platform for quantifying how cell-intrinsic and cell-extrinsic factors contribute to gene expression, protein abundance and other cellular phenotypes.

2022-02-01

rPAC: Route based pathway analysis for cohorts of gene expression data sets

Joshi P, Basso B, Wang H, Hong SH, Giardina C, Shin DG

TMC-UConn/Scripps

Pathway analysis is a popular method aiming to derive biological interpretation from high-throughput gene expression studies. However, existing methods focus mostly on identifying which pathway or pathways could have been perturbed, given differential gene expression patterns. In this paper, we present a novel pathway analysis framework, namely rPAC, which decomposes each signaling pathway route into two parts, the upstream portion of a transcription factor (TF) block and the downstream portion from the TF block, and generates a pathway route perturbation analysis scheme examining disturbance scores assigned to both parts together. This rPAC scoring is further applied to a cohort of gene expression data sets, which produces two summary metrics, "Proportion of Significance" (PS) and "Average Route Score" (ARS), as quantitative measures discerning perturbed pathway routes within and/or between cohorts. To demonstrate rPAC's scoring competency, we first used a large amount of simulated data and compared the method's performance against that of conventional methods in terms of power curve. Next, we performed a case study involving three epithelial cancer data sets from The Cancer Genome Atlas (TCGA). The rPAC method revealed specific pathway routes as potential cancer type signatures. A deeper pathway analysis of sub-groups (i.e., age groups in COAD or cancer sub-types in BRCA) resulted in pathway routes that are known to be associated with the sub-groups. In addition, multiple previously uncharacterized pathway routes were identified, potentially suggesting that rPAC is better at deciphering the etiology of a disease than conventional methods, particularly in isolating routes and sections of perturbed pathways at a finer granularity.

2022-02-11

Spatial-CUT&Tag: Spatially resolved chromatin modification profiling at the cellular level

Deng Y, Bartosovic M, Kukanja P, Zhang D, Liu Y, Su G, Enninful A, Bai Z, Castelo-Branco G, Fan R

TTD-Yale

Spatial omics emerged as a new frontier of biological and biomedical research. Here, we present spatial-CUT&Tag for spatially resolved genome-wide profiling of histone modifications by combining in situ CUT&Tag chemistry, microfluidic deterministic barcoding, and next-generation sequencing. Spatially resolved chromatin states in mouse embryos revealed tissue-type-specific epigenetic regulations in concordance with ENCODE references and provide spatial information at tissue scale. Spatial-CUT&Tag revealed epigenetic control of the cortical layer development and spatial patterning of cell types determined by histone modification in mouse brain. Single-cell epigenomes can be derived in situ by identifying 20-micrometer pixels containing only one nucleus using immunofluorescence imaging. Spatial chromatin modification profiling in tissue may offer new opportunities to study epigenetic regulation, cell function, and fate decision in normal physiology and pathogenesis.

2022-02-11

Uncovering Molecular Heterogeneity in the Kidney With Spatially Targeted Mass Spectrometry

Kruse ARS, Spraggins JM

TMC-Vanderbilt (Eye/pancreas)

The kidney functions through the coordination of approximately one million multifunctional nephrons in 3-dimensional space. Molecular understanding of the kidney has relied on transcriptomic, proteomic, and metabolomic analyses of kidney homogenate, but these approaches do not resolve cellular identity and spatial context. Mass spectrometry analysis of isolated cells retains cellular identity but not information regarding its cellular neighborhood and extracellular matrix. Spatially targeted mass spectrometry is uniquely suited to molecularly characterize kidney tissue while retaining in situ cellular context. This review summarizes advances in methodology and technology for spatially targeted mass spectrometry analysis of kidney tissue. Profiling technologies such as laser capture microdissection (LCM) coupled to liquid chromatography tandem mass spectrometry provide deep molecular coverage of specific tissue regions, while imaging technologies such as matrix assisted laser desorption/ionization imaging mass spectrometry (MALDI IMS) molecularly profile regularly spaced tissue regions with greater spatial resolution. These technologies individually have furthered our understanding of heterogeneity in nephron regions such as glomeruli and proximal tubules, and their combination is expected to profoundly expand our knowledge of the kidney in health and disease.

2022-03-01

Spatial mapping of protein composition and tissue organization: a primer for multiplexed antibody-based imaging

Hickey JW, Neumann EK, Radtke AJ, Camarillo JM, Beuschel RT, Albanese A, McDonough E, Hatler J, Wiblin AE, Fisher J, Croteau J, Small EC, Sood A, Caprioli RM, Angelo RM, Nolan GP, Chung K, Hewitt SM, Germain RN, Spraggins JM, Lundberg E, Snyder MP, Kelleher NL, Saka SK

TMC-Stanford

Tissues and organs are composed of distinct cell types that must operate in concert to perform physiological functions. Efforts to create high-dimensional biomarker catalogs of these cells have been largely based on single-cell sequencing approaches, which lack the spatial context required to understand critical cellular communication and correlated structural organization. To probe in situ biology with sufficient depth, several multiplexed protein imaging methods have been recently developed. Though these technologies differ in strategy and mode of immunolabeling and detection tags, they commonly utilize antibodies directed against protein biomarkers to provide detailed spatial and functional maps of complex tissues. As these promising antibody-based multiplexing approaches become more widely adopted, new frameworks and considerations are critical for training future users, generating molecular tools, validating antibody panels, and harmonizing datasets. In this Perspective, we provide essential resources, key considerations for obtaining robust and reproducible imaging data, and specialized knowledge from domain experts and technology developers.

2022-03-07

MITI minimum information guidelines for highly multiplexed tissue images

Schapiro D, Yapp C, Sokolov A, Reynolds SM, Chen YA, Sudar D, Xie Y, Muhlich J, Arias-Camison R, Arena S, Taylor AJ, Nikolov M, Tyler M, Lin JR, Burlingame EA; Human Tumor Atlas Network, Chang YH, Farhi SL, Thorsson V, Venkatamohan N, Drewes JL, Pe'er D, Gutman DA, Herrmann MD, Gehlenborg N, Bankhead P, Roland JT, Herndon JM, Snyder MP, Angelo M, Nolan G, Swedlow JR, Schultz N, Merrick DT, Mazzili SA, Cerami E, Rodig SJ, Santagata S, Sorger PK

HIVE TC-Harvard

N/A

2022-03-07

TraSig: inferring cell-cell interactions from pseudotime ordering of scRNA-Seq data

Li D, Velazquez JJ, Ding J, Hislop J, Ebrahimkhani MR, Bar-Joseph Z

HIVE TC-CMU

A major advantage of single cell RNA-sequencing (scRNA-Seq) data is the ability to reconstruct continuous ordering and trajectories for cells. Here we present TraSig, a computational method for improving the inference of cell-cell interactions in scRNA-Seq studies that utilizes the dynamic information to identify significant ligand-receptor pairs with similar trajectories, which in turn are used to score interacting cell clusters. We applied TraSig to several scRNA-Seq datasets and obtained unique predictions that improve upon those identified by prior methods. Functional experiments validate the ability of TraSig to identify novel signaling interactions that impact vascular development in liver organoids. Software: https://github.com/doraadong/TraSig

2022-03-09

A single-cell regulatory map of postnatal lung alveologenesis in humans and mice

Duong TE, Wu Y, Sos BC, Dong W, Limaye S, Rivier LH, Myers G, Hagood JS, Zhang K

TMC-UCSD

Ex-utero regulation of the lungs' responses to breathing air and continued alveolar development shape adult respiratory health. Applying single-cell transposome hypersensitive site sequencing (scTHS-seq) to over 80,000 cells, we assembled the first regulatory atlas of postnatal human and mouse lung alveolar development. We defined regulatory modules and elucidated new mechanistic insights directing alveolar septation, including alveolar type 1 and myofibroblast cell signaling and differentiation, and a unique human matrix fibroblast population. Incorporating GWAS, we mapped lung function causal variants to myofibroblasts and identified a pathogenic regulatory unit linked to lineage marker FGF18, demonstrating the utility of chromatin accessibility data to uncover disease mechanism targets. Our regulatory map and analysis model provide valuable new resources to investigate age-dependent and species-specific control of critical developmental processes. Furthermore, these resources complement existing atlas efforts to advance our understanding of lung health and disease across the human lifespan.

2022-03-15

Multicellular modules as clinical diagnostic and therapeutic targets

Baertsch MA, Nolan GP, Hickey JW

TMC-Stanford

The complex determinants of health and disease can be determined when approached as a system of interactions of biological agents at different scales. Similar to the physicochemical properties that govern nucleic acids and proteins, there should be a finite set of rules that dictate the behavior of cells to form tissues. Thus, the occurrence of disease can be seen as flaws in processes that are governed by rules pertaining to multicellular structures. Multiplexed imaging is a technology that connects information that bridges multiple biological scales (i.e., molecules, cells, and tissues) and enables elucidation of rules associated with the formation of multicellular structures. Uncovering important multicellular structures associated with disease will propel a wave of development of new categories of diagnostics and therapeutics.

2022-03-15

Limited extent and consequences of pancreatic SARS-CoV-2 infection

van der Heide V, Jangra S, Cohen P, Rathnasinghe R, Aslam S, Aydillo T, Geanon D, Handler D, Kelley G, Lee B, Rahman A, Dawson T, Qi J, D'Souza D, Kim-Schulze S, Panzer JK, Caicedo A, Kusmartseva I, Posgai AL, Atkinson MA, Albrecht RA, García-Sastre A, Rosenberg BR, Schotsaert M, Homann D

TMC-Florida

Concerns that infection with severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2), the etiological agent of coronavirus disease 2019 (COVID-19), may cause new-onset diabetes persist in an evolving research landscape, and precise risk assessment is hampered by, at times, conflicting evidence. Here, leveraging comprehensive single-cell analyses of in vitro SARS-CoV-2-infected human pancreatic islets, we demonstrate that productive infection is strictly dependent on the SARS-CoV-2 entry receptor ACE2 and targets practically all pancreatic cell types. Importantly, the infection remains highly circumscribed and largely non-cytopathic and, despite a high viral burden in infected subsets, promotes only modest cellular perturbations and inflammatory responses. Similar experimental outcomes are also observed after islet infection with endemic coronaviruses. Thus, the limits of pancreatic SARS-CoV-2 infection, even under in vitro conditions of enhanced virus exposure, challenge the proposition that in vivo targeting of β cells by SARS-CoV-2 precipitates new-onset diabetes. Whether restricted pancreatic damage and immunological alterations accrued by COVID-19 increase cumulative diabetes risk, however, remains to be evaluated.

2022-03-18

Machine Learning Classifies Ferroptosis and Apoptosis Cell Death Modalities with TfR1 Immunostaining

Jin J, Schorpp K, Samaga D, Unger K, Hadian K, Stockwell BR

TTD-Columbia/Penn State

Determining cell death mechanisms occurring in patient and animal tissues is a longstanding goal that requires suitable biomarkers and accurate quantification. However, effective methods remain elusive. To develop more powerful and unbiased analytic frameworks, we developed a machine learning approach for automated cell death classification. Image sets were collected of HT-1080 fibrosarcoma cells undergoing ferroptosis or apoptosis and stained with an anti-transferrin receptor 1 (TfR1) antibody, together with nuclear and F-actin staining. Features were extracted using high-content-analysis software, and a classifier was constructed by fitting a multinomial logistic lasso regression model to the data. The prediction accuracy of the classifier within three classes (control, ferroptosis, apoptosis) was 93%. Thus, TfR1 staining, combined with nuclear and F-actin staining, can reliably detect both apoptotic and ferroptotic cells when cell features are analyzed in an unbiased manner using machine learning, providing a method for unbiased analysis of modes of cell death.

2022-03-28

Integrating transcription-factor abundance with chromatin accessibility in human erythroid lineage commitment

Baskar R, Chen AF, Favaro P, Reynolds W, Mueller F, Borges L, Jiang S, Park HS, Kool ET, Greenleaf WJ, Bendall SC

RTI-Stanford

Master transcription factors (TFs) directly regulate present and future cell states by binding DNA regulatory elements and driving gene-expression programs. Their abundance influences epigenetic priming to different cell fates at the chromatin level, especially in the context of differentiation. In order to link TF protein abundance to changes in TF motif accessibility and open chromatin, we developed InTAC-seq, a method for simultaneous quantification of genome-wide chromatin accessibility and intracellular protein abundance in fixed cells. Our method produces high-quality data and is a cost-effective alternative to single-cell techniques. We showcase our method by purifying bone marrow (BM) progenitor cells based on GATA-1 protein levels and establish high GATA-1-expressing BM cells as both epigenetically and functionally similar to erythroid-committed progenitors.

2022-03-30

Spatial-CITE-seq: spatially resolved high-plex protein and whole transcriptome co-mapping

Fan R, Liu Y, DiStasio M, Su G, Asashima H, Enninful A, Qin X, Deng Y, Bordignon P, Cassano M, Tomayko M, Xu M, Halene S, Craft J, Hafler D

TTD-Yale

We present spatial-CITE-seq for high-plex protein and whole transcriptome co-mapping, which was first demonstrated by profiling 198 proteins and the transcriptome in multiple mouse tissue types. It was then applied to human tissues to measure 283 proteins and the transcriptome, which revealed a spatially distinct germinal center reaction in tonsil and early immune activation in skin at the COVID-19 mRNA vaccine injection site. Spatial-CITE-seq may find a range of applications in biomedical research.

2022-04-01

Stratification of chemotherapy-treated stage III colorectal cancer patients using multiplexed imaging and single-cell analysis of T-cell populations

Stachtea X, Loughrey MB, Salvucci M, Lindner AU, Cho S, McDonough E, Sood A, Graf J, Santamaria-Pang A, Corwin A, Laurent-Puig P, Dasgupta S, Shia J, Owens JR, Abate S, Van Schaeybroeck S, Lawler M, Prehn JHM, Ginty F, Longley DB

RTI-GEscd

Colorectal cancer (CRC) has one of the highest cancer incidences and mortality rates. In stage III, postoperative chemotherapy benefits <20% of patients, while more than 50% will develop distant metastases. Biomarkers for identification of patients at increased risk of disease recurrence following adjuvant chemotherapy are currently lacking. In this study, we assessed immune signatures in the tumor and tumor microenvironment (TME) using an in situ multiplexed immunofluorescence imaging and single-cell analysis technology (Cell DIVE™) and evaluated their correlations with patient outcomes. Tissue microarrays (TMAs) with up to three 1 mm diameter cores per patient were prepared from 117 stage III CRC patients treated with adjuvant fluoropyrimidine/oxaliplatin (FOLFOX) chemotherapy. Single sections underwent multiplexed immunofluorescence staining for immune cell markers (CD45, CD3, CD4, CD8, FOXP3, PD1) and tumor/cell segmentation markers (DAPI, pan-cytokeratin, AE1, NaKATPase, and S6). We used annotations and a probabilistic classification algorithm to build statistical models of immune cell types. Images were also qualitatively assessed independently by a pathologist as 'high', 'moderate' or 'low', for stromal and total immune cell content. Excellent agreement was found between manual assessment and total automated scores (p < 0.0001). Moreover, compared to single markers, a multi-marker classification of regulatory T cells (Tregs: CD3+/CD4+/FOXP3+/PD1-) was significantly associated with disease-free survival (DFS) and overall survival (OS) (p = 0.049 and 0.032) of FOLFOX-treated patients. Our results also showed that PD1- Tregs rather than PD1+ Tregs were associated with improved survival. These findings were supported by results from an independent FOLFOX-treated cohort of 191 stage III CRC patients, where higher PD1- Tregs were associated with increased overall survival (p = 0.015) for CD3+/CD4+/FOXP3+/PD1-. Overall, compared to single markers, multi-marker classification provided more accurate quantitation of immune cell types with stronger correlations with outcomes.

2022-04-01

Proteomics Standards Initiative's ProForma 2.0: Unifying the Encoding of Proteoforms and Peptidoforms

LeDuc RD, Deutsch EW, Binz PA, Fellers RT, Cesnik AJ, Klein JA, Van Den Bossche T, Gabriels R, Yalavarthi A, Perez-Riverol Y, Carver J, Bittremieux W, Kawano S, Pullman B, Bandeira N, Kelleher NL, Thomas PM, Vizcaíno JA

RTI-Northwestern

It is important for the proteomics community to have a standardized manner to represent all possible variations of a protein or peptide primary sequence, including natural, chemically induced, and artifactual modifications. The Human Proteome Organization Proteomics Standards Initiative in collaboration with several members of the Consortium for Top-Down Proteomics (CTDP) has developed a standard notation called ProForma 2.0, which is a substantial extension of the original ProForma notation developed by the CTDP. ProForma 2.0 aims to unify the representation of proteoforms and peptidoforms. ProForma 2.0 supports use cases needed for bottom-up and middle-/top-down proteomics approaches and allows the encoding of highly modified proteins and peptides using a human- and machine-readable string. ProForma 2.0 can be used to represent protein modifications in a specified or ambiguous location, designated by mass shifts, chemical formulas, or controlled vocabulary terms, including cross-links (natural and chemical) and atomic isotopes. Notational conventions are based on public controlled vocabularies and ontologies. The most up-to-date full specification document and information about software implementations are available at http://psidev.info/proforma.

2022-04-01

Putting Humpty Dumpty Back Together Again: What Does Protein Quantification Mean in Bottom-Up Proteomics?

Plubell DL, Käll L, Webb-Robertson BJ, Bramer LM, Ives A, Kelleher NL, Smith LM, Montine TJ, Wu CC, MacCoss MJ

RTI-Northwestern

Bottom-up proteomics provides peptide measurements and has been invaluable for moving proteomics into large-scale analyses. Commonly, a single quantitative value is reported for each protein-coding gene by aggregating peptide quantities into protein groups following protein inference or parsimony. However, given the complexity of both RNA splicing and post-translational protein modification, it is overly simplistic to assume that all peptides that map to a singular protein-coding gene will demonstrate the same quantitative response. By assuming that all peptides from a protein-coding sequence are representative of the same protein, we may miss the discovery of important biological differences. To capture the contributions of existing proteoforms, we need to reconsider the practice of aggregating protein values to a single quantity per protein-coding gene.

2022-04-01

A complete reference genome improves analysis of human genetic variation

Aganezov S, Yan SM, Soto DC, Kirsche M, Zarate S, Avdeyev P, Taylor DJ, Shafin K, Shumate A, Xiao C, Wagner J, McDaniel J, Olson ND, Sauria MEG, Vollger MR, Rhie A, Meredith M, Martin S, Lee J, Koren S, Rosenfeld JA, Paten B, Layer R, Chin CS, Sedlazeck FJ, Hansen NF, Miller DE, Phillippy AM, Miga KH, McCoy RC, Dennis MY, Zook JM, Schatz MC

HIVE TC-CMU

Compared to its predecessors, the Telomere-to-Telomere CHM13 genome adds nearly 200 million base pairs of sequence, corrects thousands of structural errors, and unlocks the most complex regions of the human genome for clinical and functional study. We show how this reference universally improves read mapping and variant calling for 3202 and 17 globally diverse samples sequenced with short and long reads, respectively. We identify hundreds of thousands of variants per sample in previously unresolved regions, showcasing the promise of the T2T-CHM13 reference for evolutionary and biomedical discovery. Simultaneously, this reference eliminates tens of thousands of spurious variants per sample, including reduction of false positives in 269 medically relevant genes by up to a factor of 12. Because of these improvements in variant discovery coupled with population and functional genomic resources, T2T-CHM13 is positioned to replace GRCh38 as the prevailing reference for human genetics.

2022-04-01

Analyzing Spatial Transcriptomics Data Using Giotto

Del Rossi N, Chen JG, Yuan GC, Dries R

TTD-Cal Tech

Spatial transcriptomic technologies have been developed rapidly in recent years. The addition of spatial context to expression data holds the potential to revolutionize many fields in biology. However, the lack of computational tools remains a bottleneck that is preventing the broader utilization of these technologies. Recently, we have developed Giotto as a comprehensive, generally applicable, and user-friendly toolbox for spatial transcriptomic data analysis and visualization. Giotto implements a rich set of algorithms to enable robust spatial data analysis. To help users get familiar with the Giotto environment and apply it effectively in analyzing new datasets, we will describe the detailed protocols for applying Giotto without any advanced programming skills. © 2022 Wiley Periodicals LLC. Basic Protocol 1: Getting Giotto set up for use. Basic Protocol 2: Pre-processing. Basic Protocol 3: Clustering and cell-type identification. Basic Protocol 4: Cell-type enrichment and deconvolution analyses. Basic Protocol 5: Spatial structure analysis tools. Basic Protocol 6: Spatial domain detection by using a hidden Markov random field model. Support Protocol 1: Spatial proximity-associated cell-cell interactions. Support Protocol 2: Assembly of a registered 3D Giotto object from 2D slices.

2022-04-01

Complete genomic and epigenetic maps of human centromeres

Altemose N, Logsdon GA, Bzikadze AV, Sidhwani P, Langley SA, Caldas GV, Hoyt SJ, Uralsky L, Ryabov FD, Shew CJ, Sauria MEG, Borchers M, Gershman A, Mikheenko A, Shepelev VA, Dvorkina T, Kunyavskaya O, Vollger MR, Rhie A, McCartney AM, Asri M, Lorig-Roach R, Shafin K, Lucas JK, Aganezov S, Olson D, de Lima LG, Potapova T, Hartley GA, Haukness M, Kerpedjiev P, Gusev F, Tigyi K, Brooks S, Young A, Nurk S, Koren S, Salama SR, Paten B, Rogaev EI, Streets A, Karpen GH, Dernburg AF, Sullivan BA, Straight AF, Wheeler TJ, Gerton JL, Eichler EE, Phillippy AM, Timp W, Dennis MY, O'Neill RJ, Zook JM, Schatz MC, Pevzner PA, Diekhans M, Langley CH, Alexandrov IA, Miga KH

HIVE TC-CMU

Existing human genome assemblies have almost entirely excluded repetitive sequences within and near centromeres, limiting our understanding of their organization, evolution, and functions, which include facilitating proper chromosome segregation. Now, a complete, telomere-to-telomere human genome assembly (T2T-CHM13) has enabled us to comprehensively characterize pericentromeric and centromeric repeats, which constitute 6.2% of the genome (189.9 megabases). Detailed maps of these regions revealed multimegabase structural rearrangements, including in active centromeric repeat arrays. Analysis of centromere-associated sequences uncovered a strong relationship between the position of the centromere and the evolution of the surrounding DNA through layered repeat expansions. Furthermore, comparisons of chromosome X centromeres across a diverse panel of individuals illuminated high degrees of structural, epigenetic, and sequence variation in these complex and rapidly evolving regions.

2022-04-01

Ductal Carcinoma In Situ of Breast: From Molecular Etiology to Therapeutic Management

Hophan SL, Odnokoz O, Liu H, Luo Y, Khan S, Gradishar W, Zhou Z, Badve S, Torres MA, Wan Y

TTD-PNNL/Northwestern

Ductal carcinoma in situ (DCIS) makes up a majority of noninvasive breast cancer cases. DCIS is a neoplastic proliferation of epithelial cells within the ductal structure of the breast. Currently, there is little known about the progression of DCIS to invasive ductal carcinoma (IDC), or the molecular etiology behind each DCIS lesion or grade. The DCIS lesions can be heterogeneous in morphology, genetics, cellular biology, and clinical behavior, posing challenges to our understanding of the molecular mechanisms by which approximately half of all DCIS lesions progress to an invasive status. New strategies that pinpoint molecular mechanisms are necessary to overcome this gap in understanding, which is a barrier to more targeted therapy. In this review, we will discuss the etiological factors associated with DCIS, as well as the complexity of each nuclear grade lesion. Moreover, we will discuss the possible molecular features that lead to progression of DCIS to IDC. We will highlight current therapeutic management and areas for improvement.

2022-04-04

Supervised Deep Generation of High-Resolution Arterial Phase Computed Tomography Kidney Substructure Atlas

Lee HH, Tang Y, Bao S, Yang Q, Xu X, Fogo AB, Harris R, de Caestecker MP, Spraggins JM, Heinrich M, Huo Y, Landman BA

TMC-Vanderbilt (Kidney)

The Human BioMolecular Atlas Program (HuBMAP) provides an opportunity to contextualize findings across cellular to organ systems levels. Constructing an atlas target is the primary endpoint for generalizing anatomical information across scales and populations. An initial target of HuBMAP is the kidney organ, and arterial phase contrast-enhanced computed tomography (CT) provides distinctive appearance and anatomical context on the internal substructure of kidney organs such as renal cortex, medulla, and pelvicalyceal system. With the confounding effects of demographics and morphological characteristics of the kidney across large-scale imaging surveys, substantial variation is demonstrated with the internal substructure morphometry and the intensity contrast due to the variance of imaging protocols. Such variability makes it more difficult to localize the anatomical features of the kidney substructure in a well-defined spatial reference for clinical analysis. In order to stabilize the localization of kidney substructures in the context of this variability, we propose a high-resolution CT kidney substructure atlas template. Briefly, we introduce a deep learning preprocessing technique to extract the volumetric interest of the abdominal regions and further perform a deep supervised registration pipeline to stably adapt the anatomical context of the kidney internal substructure. To generate and evaluate the atlas template, arterial phase CT scans of 500 control subjects are de-identified and registered to the atlas template with a complete end-to-end pipeline. With stable registration to the abdominal wall and kidney organs, the internal substructure of both left and right kidneys is substantially localized in the high-resolution atlas space. The atlas average template successfully demonstrated the contextual details of the internal structure and was applicable to generalize the morphological variation of internal substructure across patients.

2022-04-06

Concerted modification of nucleotides at functional centers of the ribosome revealed by single-molecule RNA modification profiling

Bailey AD 4th, Talkish J, Ding H, Igel HA, Duran A, Mantripragada S, Paten B, Ares M Jr

HIVE TC-CMU

Nucleotides in RNA and DNA are chemically modified by numerous enzymes that alter their function. Eukaryotic ribosomal RNA (rRNA) is modified at more than 100 locations, particularly at highly conserved and functionally important nucleotides. During ribosome biogenesis, modifications are added at various stages of assembly. The existence of differently modified classes of ribosomes in normal cells is unknown because no method exists to simultaneously evaluate the modification status at all sites within a single rRNA molecule. Using a combination of yeast genetics and nanopore direct RNA sequencing, we developed a reliable method to track the modification status of single rRNA molecules at 37 sites in 18S rRNA and 73 sites in 25S rRNA. We use our method to characterize patterns of modification heterogeneity and identify concerted modification of nucleotides found near functional centers of the ribosome. Distinct, undermodified subpopulations of rRNAs accumulate upon loss of Dbp3 or Prp43 RNA helicases, suggesting overlapping roles in ribosome biogenesis. Modification profiles are surprisingly resistant to change in response to many genetic and acute environmental conditions that affect translation, ribosome biogenesis, and pre-mRNA splicing. The ability to capture single molecule RNA modification profiles provides new insights into the roles of nucleotide modifications in RNA function.

2022-04-12

Mapping the Proteoform Landscape of Five Human Tissues

Drown BS, Jooß K, Melani RD, Lloyd-Jones C, Camarillo JM, Kelleher NL

RTI-Northwestern

A functional understanding of the human body requires structure-function studies of proteins at scale. The chemical structure of proteins is controlled at the transcriptional, translational, and post-translational levels, creating a variety of products with modulated functions within the cell. The term "proteoform" encapsulates this complexity at the level of chemical composition. Comprehensive mapping of the proteoform landscape in human tissues necessitates analytical techniques with increased sensitivity and depth of coverage. Here, we took a top-down proteomics approach, combining data generated using capillary zone electrophoresis (CZE) and nanoflow reversed-phase liquid chromatography (RPLC) hyphenated to mass spectrometry to identify and characterize proteoforms from the human lungs, heart, spleen, small intestine, and kidneys. CZE and RPLC provided complementary post-translational modification and proteoform selectivity, thereby enhancing the overall proteome coverage when used in combination. Of the 11,466 proteoforms identified in this study, 7373 (64%) were not reported previously. Large differences in the protein and proteoform level were readily quantified, with initial inferences about proteoform biology operative in the analyzed organs. Differential proteoform regulation of defensins, glutathione transferases, and sarcomeric proteins across tissues generates hypotheses about how they function and are regulated in human health and disease.

2022-04-12

Referenced Kendrick Mass Defect Annotation and Class-Based Filtering of Imaging MS Lipidomics Experiments

Richardson LT, Neumann EK, Caprioli RM, Spraggins JM, Solouki T

TMC-Vanderbilt (Kidney)

Because of their diverse functionalities in cells, lipids are of primary importance when characterizing molecular profiles of physiological and disease states. Imaging mass spectrometry (IMS) provides the spatial distributions of lipid populations in tissues. Referenced Kendrick mass defect (RKMD) analysis is an effective mass spectrometry (MS) data analysis tool for classification and annotation of lipids. Herein, we extend the capabilities of RKMD analysis and demonstrate an integrated method for lipid annotation and chemical structure-based filtering for IMS datasets. Annotation of lipid features with lipid molecular class, radyl carbon chain length, and degree of unsaturation allows image reconstruction and visualization based on each structural characteristic. We show a proof-of-concept application of the method to a computationally generated IMS dataset and validate that the RKMD method is highly specific for lipid components in the presence of confounding background ions. Moreover, we demonstrate an application of the RKMD-based annotation and filtering to matrix-assisted laser desorption/ionization (MALDI) IMS lipidomic data from human kidney tissue analysis.
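
To make the Kendrick mass defect idea above concrete, the Python sketch below rescales masses so that CH2 weighs exactly 14. Members of a homologous series that differ only in chain length then share one defect, while each lost H2 (one additional unsaturation) shifts it by a fixed step. The sign convention (nominal minus exact), the rounded atomic masses, and all function names are assumptions made for illustration and are not taken from the paper's implementation.

```python
CH2_EXACT = 14.01565   # monoisotopic mass of CH2 in Da (rounded; assumption)
CH2_NOMINAL = 14.0
H2_EXACT = 2.01565     # monoisotopic mass of H2 in Da (rounded; assumption)


def kendrick_mass(mz):
    """Rescale a mass so that one CH2 unit weighs exactly 14."""
    return mz * CH2_NOMINAL / CH2_EXACT


def kendrick_mass_defect(mz):
    """Nominal Kendrick mass minus exact Kendrick mass (one common convention)."""
    km = kendrick_mass(mz)
    return round(km) - km


def referenced_kmd(mz, reference_kmd):
    """Offset from a class reference defect, in units of the H2 defect.

    With the convention used here, a value near -1 suggests one more
    unsaturation than the reference species, -2 two more, and so on.
    """
    h2_step = kendrick_mass_defect(H2_EXACT)
    return (kendrick_mass_defect(mz) - reference_kmd) / h2_step


if __name__ == "__main__":
    saturated = 762.6007                       # illustrative m/z value
    reference = kendrick_mass_defect(saturated)
    print(kendrick_mass_defect(saturated + CH2_EXACT) - reference)  # chain extension
    print(referenced_kmd(saturated - H2_EXACT, reference))          # one unsaturation
```

In practice a tolerance around each integer value would be needed to decide whether a feature belongs to a class; that threshold depends on instrument mass accuracy and is left out of this sketch.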

2022-04-13

Whole-cell segmentation of tissue images with human-level performance using large-scale data annotation and deep learning

Greenwald NF, Miller G, Moen E, Kong A, Kagel A, Dougherty T, Fullaway CC, McIntosh BJ, Leow KX, Schwartz MS, Pavelchek C, Cui S, Camplisson I, Bar-Tal O, Singh J, Fong M, Chaudhry G, Abraham Z, Moseley J, Warshawsky S, Soon E, Greenbaum S, Risom T, Hollmann T, Bendall SC, Keren L, Graf W, Angelo M, Van Valen D

RTI-Stanford

A principal challenge in the analysis of tissue imaging data is cell segmentation, the task of identifying the precise boundary of every cell in an image. To address this problem we constructed TissueNet, a dataset for training segmentation models that contains more than 1 million manually labeled cells, an order of magnitude more than all previously published segmentation training datasets. We used TissueNet to train Mesmer, a deep-learning-enabled segmentation algorithm. We demonstrated that Mesmer is more accurate than previous methods, generalizes to the full diversity of tissue types and imaging platforms in TissueNet, and achieves human-level performance. Mesmer enabled the automated extraction of key cellular features, such as subcellular localization of protein signal, which was challenging with previous approaches. We then adapted Mesmer to harness cell lineage information in highly multiplexed datasets and used this enhanced version to quantify cell morphology changes during human gestation. All code, data and models are released as a community resource.

2022-04-14

Membrane marker selection for segmenting single cell spatial proteomics data

Dayao MT, Brusko M, Wasserfall C, Bar-Joseph Z

HIVE TC-CMU

The ability to profile spatial proteomics at the single cell level enables the study of cell types, their spatial distribution, and interactions in several tissues and conditions. Current methods for cell segmentation in such studies rely on known membrane or cell boundary markers. However, for many tissues, an optimal set of markers is not known, and even within a tissue, different cell types may express different markers. Here we present RAMCES, a method that uses a convolutional neural network to learn the optimal markers for a new sample and outputs a weighted combination of the selected markers for segmentation. Testing RAMCES on several existing datasets indicates that it correctly identifies cell boundary markers, improving on methods that rely on a single marker or those that extend nuclei segmentations. Application to new spatial proteomics data demonstrates its usefulness for accurately assigning cell types based on the proteins expressed in segmented cells.

2022-04-14

Interactive single-cell data analysis using Cellar

Hasanaj E, Wang J, Sarathi A, Ding J, Bar-Joseph Z

HIVE TC-CMU

Cell type assignment is a major challenge for all types of high throughput single cell data. In many cases such assignment requires the repeated manual use of external and complementary data sources. To improve the ability to uniformly assign cell types across large consortia, platforms and modalities, we developed Cellar, a software tool that provides interactive support to all the different steps involved in the assignment and dataset comparison process. We discuss the different methods implemented by Cellar, how these can be used with different data types, how to combine complementary data types and how to analyze and visualize spatial data. We demonstrate the advantages of Cellar by using it to annotate several HuBMAP datasets from multi-omics single-cell sequencing and spatial proteomics studies. Cellar is open-source and includes several annotated HuBMAP datasets.

2022-05-03

Crowdsourced RNA design discovers diverse, reversible, efficient, self-contained molecular switches

Andreasson JOL, Gotrik MR, Wu MJ, Wayment-Steele HK, Kladwang W, Portela F, Wellington-Oguri R; Eterna Participants, Das R, Greenleaf WJ

TMC-Stanford

Significance: Our manuscript presents a paradigm for carrying out distributed science. We have harnessed an online RNA design game, Eterna, to challenge a large community of RNA designers to create diverse RNA sensors. RNA is an attractive, biocompatible substrate for the design and implementation of molecular sensors. We tasked the diverse Eterna community, comprising a global network of molecular design enthusiasts, to submit thousands to tens of thousands of "solutions" to these RNA sensor design challenges. Crucially, community designs were synthesized and tested experimentally in the real world using high-throughput methods for biochemical assays built on repurposed DNA sequencers. The best player-generated designs for RNA sensors approached the thermodynamic optimum.

2022-05-03

PHGDH expression increases with progression of Alzheimer’s disease pathology and symptoms

Chen X, Calandrelli R, Girardini J, Yan Z, Tan Z, Xu X, Hiniker A, Zhong S

TTD-UCSD/City of Hope

Chen et al. reveal an increase of phosphoglycerate dehydrogenase (PHGDH) mRNA and protein levels in two mouse models and four human cohorts in Alzheimer's disease brains compared to age- and sex-matched control brains. The increase of PHGDH expression in human brain correlates with symptomatic development and disease pathology.

2022-05-11

Viv: multiscale visualization of high-resolution multiplexed bioimaging data on the web

Manz T, Gold I, Patterson NH, McCallum C, Keller MS, Herr BW 2nd, Börner K, Spraggins JM, Gehlenborg N

HIVE TC-Harvard

2022-05-13

Nuclear oligo hashing improves differential analysis of single-cell RNA-seq

Kim HJ, Booth G, Saunders L, Srivatsan S, McFaline-Figueroa JL, Trapnell C

TMC-Cal Tech

Single-cell RNA sequencing (scRNA-seq) offers a high-resolution molecular view into complex tissues, but suffers from high levels of technical noise which frustrates efforts to compare the gene expression programs of different cell types. "Spike-in" RNA standards help control for technical variation in scRNA-seq, but using them with recently developed, ultra-scalable scRNA-seq methods based on combinatorial indexing is not feasible. Here, we describe a simple and cost-effective method for normalizing transcript counts and subtracting technical variability that improves differential expression analysis in scRNA-seq. The method affixes a ladder of synthetic single-stranded DNA oligos to each cell that appears in its RNA-seq library. With improved normalization we explore chemical perturbations with broad or highly specific effects on gene regulation, including RNA pol II elongation, histone deacetylation, and activation of the glucocorticoid receptor. Our methods reveal that inhibiting histone deacetylation prevents cells from executing their canonical program of changes following glucocorticoid stimulation.

2022-05-31

NEAT-seq: simultaneous profiling of intra-nuclear proteins, chromatin accessibility and gene expression in single cells

Chen AF, Parks B, Kathiria AS, Ober-Reynolds B, Goronzy JJ, Greenleaf WJ

TMC-Stanford

In this work, we describe NEAT-seq (sequencing of nuclear protein epitope abundance, chromatin accessibility and the transcriptome in single cells), enabling interrogation of regulatory mechanisms spanning the central dogma. We apply this technique to profile CD4 memory T cells using a panel of master transcription factors (TFs) that drive T cell subsets and identify examples of TFs with regulatory activity gated by transcription, translation and regulation of chromatin binding. We also link a noncoding genome-wide association study single-nucleotide polymorphism (SNP) within a GATA motif to a putative target gene, using NEAT-seq data to internally validate SNP impact on GATA3 regulation.

2022-06-01

Revealing new biology from multiplexed, metal-isotope-tagged, single-cell readouts

Baskar R, Kimmey SC, Bendall SC

RTI-Stanford

Mass cytometry (MC) is a recent technology that pairs plasma-based ionization of cells in suspension with time-of-flight (TOF) mass spectrometry to sensitively quantify the single-cell abundance of metal-isotope-tagged affinity reagents to key proteins, RNA, and peptides. Given the ability to multiplex readouts (~50 per cell) and capture millions of cells per experiment, MC offers a robust way to assay rare, transitional cell states that are pertinent to human development and disease. Here, we review MC approaches that let us probe the dynamics of cellular regulation across multiple conditions and sample types in a single experiment. Additionally, we discuss current limitations and future extensions of MC as well as computational tools commonly used to extract biological insight from single-cell proteomic datasets.

2022-06-06

Cell Trafficking at the Intersection of the Tumor-Immune Compartments

Du W, Nair P, Johnston A, Wu PH, Wirtz D

TMC-JHU

Migration is an essential cellular process that regulates human organ development and homeostasis as well as disease initiation and progression. In cancer, immune and tumor cell migration is strongly associated with immune cell infiltration, immune escape, and tumor cell metastasis, which ultimately account for more than 90% of cancer deaths. The biophysics and molecular regulation of the migration of cancer and immune cells have been extensively studied separately. However, accumulating evidence indicates that, in the tumor microenvironment, the motilities of immune and cancer cells are highly interdependent via secreted factors such as cytokines and chemokines. Tumor and immune cells constantly express these soluble factors, which produce a tightly intertwined regulatory network for these cells' respective migration. A mechanistic understanding of the reciprocal regulation of soluble factor-mediated cell migration can provide critical information for the development of new biomarkers of tumor progression and of tumor response to immuno-oncological treatments. We review the biophysical and biomolecular basis for the migration of immune and tumor cells and their associated reciprocal regulatory network. We also describe ongoing attempts to translate this knowledge into the clinic.

2022-06-10

A reference tissue atlas for the human kidney

Hansen J, Sealfon R, Menon R, Eadon MT, Lake BB, Steck B, Anjani K, Parikh S, Sigdel TK, Zhang G, Velickovic D, Barwinska D, Alexandrov T, Dobi D, Rashmi P, Otto EA, Rivera M, Rose MP, Anderton CR, Shapiro JP, Pamreddy A, Winfree S, Xiong Y, He Y, de Boer IH, Hodgin JB, Barisoni L, Naik AS, Sharma K, Sarwal MM, Zhang K, Himmelfarb J, Rovin B, El-Achkar TM, Laszik Z, He JC, Dagher PC, Valerius MT, Jain S, Satlin LM, Troyanskaya OG, Kretzler M, Iyengar R, Azeloglu EU; Kidney Precision Medicine Project

TMC-UCSD

Kidney Precision Medicine Project (KPMP) is building a spatially specified human kidney tissue atlas in health and disease with single-cell resolution. Here, we describe the construction of an integrated reference map of cells, pathways, and genes using unaffected regions of nephrectomy tissues and undiseased human biopsies from 56 adult subjects. We use single-cell/nucleus transcriptomics, subsegmental laser microdissection transcriptomics and proteomics, near-single-cell proteomics, 3D and CODEX imaging, and spatial metabolomics to hierarchically identify genes, pathways, and cells. Integrated data from these different technologies coherently identify cell types/subtypes within different nephron segments and the interstitium. These profiles describe cell-level functional organization of the kidney following its physiological functions and link cell subtypes to genes, proteins, metabolites, and pathways. They further show that messenger RNA levels along the nephron are congruent with the subsegmental physiological activity. This reference atlas provides a framework for the classification of kidney disease when multiple molecular mechanisms underlie convergent clinical phenotypes.

2022-06-14

Chasing perfection: validation and polishing strategies for telomere-to-telomere genome assemblies

Mc Cartney AM, Shafin K, Alonge M, Bzikadze AV, Formenti G, Fungtammasan A, Howe K, Jain C, Koren S, Logsdon GA, Miga KH, Mikheenko A, Paten B, Shumate A, Soto DC, Sović I, Wood JMD, Zook JM, Phillippy AM, Rhie A

HIVE TC-CMU

Advances in long-read sequencing technologies and genome assembly methods have enabled the recent completion of the first telomere-to-telomere human genome assembly, which resolves complex segmental duplications and large tandem repeats, including centromeric satellite arrays in a complete hydatidiform mole (CHM13). Although derived from highly accurate sequences, evaluation revealed evidence of small errors and structural misassemblies in the initial draft assembly. To correct these errors, we designed a new repeat-aware polishing strategy that made accurate assembly corrections in large repeats without overcorrection, ultimately fixing 51% of the existing errors and improving the assembly quality value from 70.2 to 73.9 measured from PacBio high-fidelity and Illumina k-mers. By comparing our results to standard automated polishing tools, we outline common polishing errors and offer practical suggestions for genome projects with limited resources. We also show how sequencing biases in both high-fidelity and Oxford Nanopore Technologies reads cause signature assembly errors that can be corrected with a diverse panel of sequencing technologies.
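
The quality values quoted above are on a Phred-style logarithmic scale, where QV = -10 · log10(per-base error probability). The Python sketch below shows only that conversion. How the error probability itself is estimated from k-mers is not reproduced here, and the function names are illustrative.

```python
import math


def qv_to_error_rate(qv):
    """Convert a Phred-scaled quality value to a per-base error probability."""
    return 10 ** (-qv / 10)


def error_rate_to_qv(error_rate):
    """Convert a per-base error probability to a Phred-scaled quality value."""
    if error_rate <= 0:
        raise ValueError("error rate must be positive")
    return -10 * math.log10(error_rate)


if __name__ == "__main__":
    # QV values taken from the abstract above.
    for qv in (70.2, 73.9):
        rate = qv_to_error_rate(qv)
        print(f"QV {qv}: about one error per {1 / rate:,.0f} bases")
```

Because the scale is logarithmic, a gain of 10 QV points corresponds to a tenfold drop in the estimated error rate, so differences of a few points at this level are substantial.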

2022-06-14

Merfin: improved variant filtering, assembly evaluation and polishing via k-mer validation

Formenti G, Rhie A, Walenz BP, Thibaud-Nissen F, Shafin K, Koren S, Myers EW, Jarvis ED, Phillippy AM

HIVE TC-CMU

Variant calling has been widely used for genotyping and for improving the consensus accuracy of long-read assemblies. Variant calls are commonly hard-filtered with user-defined cutoffs. However, it is impossible to define a single set of optimal cutoffs, as the calls heavily depend on the quality of the reads, the variant caller of choice and the quality of the unpolished assembly. Here, we introduce Merfin, a k-mer based variant-filtering algorithm for improved accuracy in genotyping and genome assembly polishing. Merfin evaluates each variant based on the expected k-mer multiplicity in the reads, independently of the quality of the read alignment and variant caller's internal score. Merfin increased the precision of genotyped calls in several benchmarks, improved consensus accuracy and reduced frameshift errors when applied to human and nonhuman assemblies built from Pacific Biosciences HiFi and continuous long reads or Oxford Nanopore reads, including the first complete human genome. Moreover, we introduce assembly quality and completeness metrics that account for the expected genomic copy numbers.
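
The general idea of checking a candidate sequence against k-mers observed in the reads can be sketched in a few lines of Python. This toy example only counts how many k-mers of a sequence are absent from the reads. It is not Merfin's algorithm, which reasons about expected k-mer multiplicities, and all names and sequences here are hypothetical.

```python
from collections import Counter


def count_kmers(reads, k):
    """Count every k-mer occurring in a collection of reads."""
    counts = Counter()
    for read in reads:
        for i in range(len(read) - k + 1):
            counts[read[i:i + k]] += 1
    return counts


def missing_kmers(sequence, read_counts, k):
    """Number of k-mers in `sequence` that never appear in the reads."""
    return sum(
        1
        for i in range(len(sequence) - k + 1)
        if read_counts[sequence[i:i + k]] == 0
    )


if __name__ == "__main__":
    reads = ["ACGTACGGT", "CGTACGGTA"]   # toy reads
    counts = count_kmers(reads, k=4)
    draft = "ACGTTCGGT"                  # toy draft with one wrong base
    polished = "ACGTACGGT"               # toy draft after applying a variant
    print(missing_kmers(draft, counts, 4), missing_kmers(polished, counts, 4))
```

A candidate correction that lowers the number of unsupported k-mers is more plausible than one that raises it. A real tool additionally has to account for sequencing errors in the reads and for repeat copy number, which this sketch ignores.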

2022-06-20

Single-cell analyses define a continuum of cell state and composition changes in the malignant transformation of polyps to colorectal cancer

Becker WR, Nevins SA, Chen DC, Chiu R, Horning AM, Guha TK, Laquindanum R, Mills M, Chaib H, Ladabaum U, Longacre T, Shen J, Esplin ED, Kundaje A, Ford JM, Curtis C, Snyder MP, Greenleaf WJ

TMC-Stanford

To chart cell composition and cell state changes that occur during the transformation of healthy colon to precancerous adenomas to colorectal cancer (CRC), we generated single-cell chromatin accessibility profiles and single-cell transcriptomes from 1,000 to 10,000 cells per sample for 48 polyps, 27 normal tissues and 6 CRCs collected from patients with or without germline APC mutations. A large fraction of polyp and CRC cells exhibit a stem-like phenotype, and we define a continuum of epigenetic and transcriptional changes occurring in these stem-like cells as they progress from homeostasis to CRC. Advanced polyps contain increasing numbers of stem-like cells, regulatory T cells and a subtype of pre-cancer-associated fibroblasts. In the cancerous state, we observe T cell exhaustion, RUNX1-regulated cancer-associated fibroblasts and increasing accessibility associated with HNF4A motifs in epithelia. DNA methylation changes in sporadic CRC are strongly anti-correlated with accessibility changes along this continuum, further identifying regulatory markers for molecular staging of polyps.

2022-06-30

Integrated single cell sequencing and histopathological analyses reveal diverse injury and repair responses in a participant with acute kidney injury: A clinical-molecular-pathologic correlation

Menon R, Bomback AS, Lake BB, Stutzke C, Grewenow SM, Menez S, D'Agati VD, Jain S

TMC-UCSD

2022-06-30

Temporal modelling using single-cell transcriptomics

Ding J, Sharon N, Bar-Joseph Z

HIVE TC-CMU

Methods for profiling genes at the single-cell level have revolutionized our ability to study several biological processes and systems including development, differentiation, response programmes and disease progression. In many of these studies, cells are profiled over time in order to infer dynamic changes in cell states and types, sets of expressed genes, active pathways and key regulators. However, time-series single-cell RNA sequencing (scRNA-seq) also raises several new analysis and modelling issues. These issues range from determining when and how deep to profile cells, linking cells within and between time points, learning continuous trajectories, and integrating bulk and single-cell data for reconstructing models of dynamic networks. In this Review, we discuss several approaches for the analysis and modelling of time-series scRNA-seq, highlighting their steps, key assumptions, and the types of data and biological questions they are most appropriate for.

2022-06-30

Deep learning identification of stiffness markers in breast cancer

Sneider A, Kiemen A, Kim JH, Wu PH, Habibi M, White M, Phillip JM, Gu L, Wirtz D

TMC-JHU

While essential to our understanding of solid tumor progression, the study of cell and tissue mechanics has yet to find traction in the clinic. Determining tissue stiffness, a mechanical property known to promote a malignant phenotype in vitro and in vivo, is not part of the standard algorithm for the diagnosis and treatment of breast cancer. Instead, clinicians routinely use mammograms to identify malignant lesions and radiographically dense breast tissue is associated with an increased risk of developing cancer. Whether breast density is related to tumor tissue stiffness, and what cellular and non-cellular components of the tumor contribute the most to its stiffness are not well understood. Through training of a deep learning network and mechanical measurements of fresh patient tissue, we create a bridge in understanding between clinical and mechanical markers. The automatic identification of cellular and extracellular features from hematoxylin and eosin (H&E)-stained slides reveals that global and local breast tissue stiffness best correlate with the percentage of straight collagen. Importantly, the percentage of dense breast tissue does not directly correlate with tissue stiffness or straight collagen content.

2022-07-07

scSTEM: clustering pseudotime ordered single-cell data

Song Q, Wang J, Bar-Joseph Z

HIVE TC-CMU

We develop scSTEM, single-cell STEM, a method for clustering dynamic profiles of genes in trajectories inferred from pseudotime ordering of single-cell RNA-seq (scRNA-seq) data. scSTEM uses one of several metrics to summarize the expression of genes and assigns a p-value to clusters enabling the identification of significant profiles and comparison of profiles across different paths. Application of scSTEM to several scRNA-seq datasets demonstrates its usefulness and ability to improve downstream analysis of biological processes. scSTEM is available at https://github.com/alexQiSong/scSTEM.

2022-07-07

Ferroptosis turns 10: Emerging mechanisms, physiological functions, and therapeutic applications

Stockwell BR

TTD-Columbia/Penn State

Ferroptosis, a form of cell death driven by iron-dependent lipid peroxidation, was identified as a distinct phenomenon and named a decade ago. Ferroptosis has been implicated in a broad set of biological contexts, from development to aging, immunity, and cancer. This review describes key regulators of this form of cell death within a framework of metabolism, ROS biology, and iron biology. Key concepts and major unanswered questions in the ferroptosis field are highlighted. The next decade promises to yield further breakthroughs in the mechanisms governing ferroptosis and additional ways of harnessing ferroptosis for therapeutic benefit.

2022-07-11

Reproducible, high-dimensional imaging in archival human tissue by multiplexed ion beam imaging by time-of-flight (MIBI-TOF)

Liu CC, Bosse M, Kong A, Kagel A, Kinders R, Hewitt SM, Varma S, van de Rijn M, Nowak SH, Bendall SC, Angelo M

RTI-Stanford

Multiplexed ion beam imaging by time-of-flight (MIBI-TOF) is a form of mass spectrometry imaging that uses metal labeled antibodies and secondary ion mass spectrometry to image dozens of proteins simultaneously in the same tissue section. Working with the National Cancer Institute's (NCI) Cancer Immune Monitoring and Analysis Centers (CIMAC), we undertook a validation study, assessing concordance across a dozen serial sections of a tissue microarray of 21 samples that were independently processed and imaged by MIBI-TOF or single-plex immunohistochemistry (IHC) over 12 days. Pixel-level features were highly concordant across all 16 targets assessed in both staining intensity (R² = 0.94 ± 0.04) and frequency (R² = 0.95 ± 0.04). Comparison to digitized, single-plex IHC on adjacent serial sections revealed similar concordance (R² = 0.85 ± 0.08) as well. Lastly, automated segmentation and clustering of eight cell populations found that cell frequencies between serial sections yielded an average correlation of R² = 0.94 ± 0.05. Taken together, we demonstrate that MIBI-TOF, with well-vetted reagents and automated analysis, can generate consistent and quantitative annotations of clinically relevant cell states in archival human tissue, and more broadly, present a scalable framework for benchmarking multiplexed IHC approaches.
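Illustrative note (not part of the published abstract): a concordance statistic of the kind reported above can be sketched as the squared Pearson correlation between paired measurements from two serial sections. The function name and the sample values below are hypothetical, and the study's own pipeline may define or compute R² differently.

```python
def r_squared(x, y):
    """Squared Pearson correlation between two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two equal-length sequences with at least 2 points")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sums of cross-products and squared deviations from the means.
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("R^2 is undefined when one sequence is constant")
    return cov * cov / (var_x * var_y)


# Hypothetical mean intensities of one marker in matched cores of two sections.
section_a = [1.0, 2.0, 3.0, 4.0]
section_b = [2.1, 3.9, 6.2, 7.8]
print(round(r_squared(section_a, section_b), 3))
```

Values near 1 indicate that the two sections rank and scale the cores almost identically; the statistic says nothing about systematic offsets between sections.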

2022-07-12

High-Throughput Nano-DESI Mass Spectrometry Imaging of Biological Tissues Using an Integrated Microfluidic Probe

Li X, Hu H, Yin R, Li Y, Sun X, Dey SK, Laskin J

TTD-Purdue

Nanospray desorption electrospray mass spectrometry imaging (nano-DESI MSI) enables quantitative mapping of hundreds of molecules in biological samples with minimal sample pretreatment. We have recently developed an integrated microfluidic probe (iMFP) for nano-DESI MSI. Herein, we describe an improved design of the iMFP for the high-throughput imaging of tissue sections. We increased the dimensions of the primary and spray channels and optimized the spray voltage and solvent flow rate to obtain a stable operation of the iMFP at both low and high scan rates. We observe that the sensitivity, molecular coverage, and spatial resolution obtained using the iMFP do not change to a significant extent as the scan rate increases. Using a scan rate of 0.4 mm/s, we obtained high-quality images of mouse uterine tissue sections (scan area: 3.2 mm × 2.3 mm) in only 9.5 min and of mouse brain tissue (scan area: 7.0 mm × 5.4 mm) in 21.7 min, which corresponds to a 10-15-fold improvement in the experimental throughput. We have also developed a quantitative metric for evaluating the quality of ion images obtained at different scan rates. Using this metric, we demonstrate that the quality of nano-DESI MSI data does not degrade substantially with an increase in the scan rate. The ability to image biological tissues with high throughput using iMFP-based nano-DESI MSI will substantially speed up tissue mapping efforts.

2022-07-18

Proteoform-Selective Imaging of Tissues Using Mass Spectrometry

Yang M, Hu H, Su P, Thomas PM, Camarillo JM, Greer JB, Early BP, Fellers RT, Kelleher NL, Laskin J

TTD-Purdue

Unraveling the complexity of biological systems relies on the development of new approaches for spatially resolved proteoform-specific analysis of the proteome. Herein, we employ nanospray desorption electrospray ionization mass spectrometry imaging (nano-DESI MSI) for the proteoform-selective imaging of biological tissues. Nano-DESI generates multiply charged protein ions, which is advantageous for their structural characterization using tandem mass spectrometry (MS/MS) directly on the tissue. Proof-of-concept experiments demonstrate that nano-DESI MSI combined with on-tissue top-down proteomics is ideally suited for the proteoform-selective imaging of tissue sections. Using rat brain tissue as a model system, we provide the first evidence of differential proteoform expression in different regions of the brain.

2022-07-25

Accelerated identification of disease-causing variants with ultra-rapid nanopore genome sequencing

Goenka SD, Gorzynski JE, Shafin K, Fisk DG, Pesout T, Jensen TD, Monlong J, Chang PC, Baid G, Bernstein JA, Christle JW, Dalton KP, Garalde DR, Grove ME, Guillory J, Kolesnikov A, Nattestad M, Ruzhnikov MRZ, Samadi M, Sethia A, Spiteri E, Wright CJ, Xiong K, Zhu T, Jain M, Sedlazeck FJ, Carroll A, Paten B, Ashley EA

HIVE TC-CMU

Whole-genome sequencing (WGS) can identify variants that cause genetic disease, but the time required for sequencing and analysis has been a barrier to its use in acutely ill patients. In the present study, we develop an approach for ultra-rapid nanopore WGS that combines an optimized sample preparation protocol, distributing sequencing over 48 flow cells, near real-time base calling and alignment, accelerated variant calling and fast variant filtration for efficient manual review. Application to two example clinical cases identified a candidate variant in <8 h from sample preparation to variant identification. We show that this framework provides accurate variant calls and efficient prioritization, and accelerates diagnostic clinical genome sequencing twofold compared with previous approaches.

2022-07-26

Hanging drop sample preparation improves sensitivity of spatial proteomics

Kwon Y, Piehowski PD, Zhao R, Sontag RL, Moore RJ, Burnum-Johnson KE, Smith RD, Qian WJ, Kelly RT, Zhu Y

TMC-PNNL

Spatial proteomics holds great promise for revealing tissue heterogeneity in both physiological and pathological conditions. However, one significant limitation of most spatial proteomics workflows is the requirement for large sample amounts, which blurs cell-type-specific or microstructure-specific information. In this study, we developed an improved sample preparation approach for spatial proteomics and integrated it with our previously established laser capture microdissection (LCM) and microfluidics sample processing platform. Specifically, we developed a hanging drop (HD) method to improve the sample recovery by positioning a nanowell chip upside-down during protein extraction and tryptic digestion steps. Compared with the commonly used sitting-drop method, the HD method keeps the tissue pixel away from the container surface, and thus improves the accessibility of the extraction/digestion buffer to the tissue sample. The HD method can increase the MS signal 7-fold, leading to a 66% increase in the number of identified proteins. An average of 721, 1489, and 2521 proteins can be quantitatively profiled from laser-dissected 10 μm-thick mouse liver tissue pixels with areas of 0.0025, 0.01, and 0.04 mm², respectively. The improved system was further validated in the study of cell-type-specific proteomes of mouse uterine tissues.

2022-07-29

New Views of Old Proteins: Clarifying the Enigmatic Proteome

Burnum-Johnson KE, Conrads TP, Drake RR, Herr AE, Iyengar R, Kelly RT, Lundberg E, MacCoss MJ, Naba A, Nolan GP, Pevzner PA, Rodland KD, Sechi S, Slavov N, Spraggins JM, Van Eyk JE, Vidal M, Vogel C, Walt DR, Kelleher NL

TMC-Stanford

All human diseases involve proteins, yet our current tools to characterize and quantify them are limited. To better elucidate proteins across space, time, and molecular composition, we provide a projection spanning more than 10 years for technologies to meet the challenges that protein biology presents. With a broad perspective, we discuss grand opportunities to transition the science of proteomics into a more propulsive enterprise. Extrapolating recent trends, we describe a next generation of approaches to define, quantify, and visualize the multiple dimensions of the proteome, thereby transforming our understanding of, and interactions with, human disease in the coming decade.

2022-07-31

Clinical Implementation and Initial Experience With a 1.5 Tesla MR-Linac for MR-Guided Radiation Therapy for Gynecologic Cancer: An R-IDEAL Stage 1 and 2a First in Humans Feasibility Study of New Technology Implementation

Lakomy DS, Yang J, Vedam S, Wang J, Lee B, Sobremonte A, Castillo P, Hughes N, Mohammedsaid M, Jhingran A, Klopp AH, Choi S, Fuller CD, Lin LL

HIVE IEC-PSC

Purpose: Magnetic resonance imaging-guided linear accelerator systems (MR-linacs) can facilitate the daily adaptation of radiation therapy plans. Here, we report our early clinical experience using an MR-linac for adaptive radiation therapy of gynecologic malignancies. Methods and materials: Treatments were planned with an Elekta Monaco v5.4.01 and delivered by a 1.5 Tesla Elekta Unity MR-linac. The system offers a choice of daily adaptation based on either position (ATP) or shape (ATS) of the tumor and surrounding normal structures. The ATS approach has the option of manually editing the contours of tumors and surrounding normal structures before the plan is adapted. Here, we documented the duration of each treatment fraction; set-up variability (assessed by isocenter shifts in each plan) between fractions; and, for quality assurance, calculated the percentage of plans meeting the γ-criterion of 3%/3-mm distance to agreement. Deformable accumulated dose calculations were used to compare accumulated versus planned dose for patients treated exclusively with ATP fractions. Results: Of the 10 patients treated with 90 fractions on the MR-linac, most received boost doses to recurrence in nodes or isolated tumors. Each treatment fraction lasted a median of 32 minutes; fractions were shorter with ATP than with ATS (30 min vs 42 min, P < .0001). The γ criterion for all fraction plans exceeded 90% (median, 99.9%; range, 92.4%-100%; ie, all plans passed quality assurance testing). The average extent of isocenter shift was <0.5 cm in each axis. The accumulated dose to the gross tumor volume was within 5% of the reference plan for all ATP cases. Accumulated doses for lesions in the pelvic periphery were within 1% of the reference plan as opposed to -1.6% to -4.4% for central pelvic tumors. Conclusions: The MR-linac is a reliable and clinically feasible tool for treating patients with gynecologic cancer.

2022-07-31

Multi-contrast computed tomography healthy kidney atlas

Lee HH, Tang Y, Xu K, Bao S, Fogo AB, Harris R, de Caestecker MP, Heinrich M, Spraggins JM, Huo Y, Landman BA

TMC-Vanderbilt (Kidney)

The construction of three-dimensional multi-modal tissue maps provides an opportunity to spur interdisciplinary innovations across temporal and spatial scales through information integration. While the preponderance of effort is allocated to the cellular level and explores the changes in cell interactions and organizations, contextualizing findings within organs and systems is essential to visualize and interpret higher resolution linkage across scales. There is a substantial normal variation of kidney morphometry and appearance across body size, sex, and imaging protocols in abdominal computed tomography (CT). A volumetric atlas framework is needed to integrate and visualize the variability across scales. However, there is no abdominal and retroperitoneal organ atlas framework for multi-contrast CT. Hence, we proposed a high-resolution CT retroperitoneal atlas specifically optimized for the kidney organ across non-contrast CT and early arterial, late arterial, venous and delayed contrast-enhanced CT. We introduce a deep learning-based volume-of-interest extraction method that localizes the 2D slices with a representative score and crops within the range of the abdominal interest. An automated two-stage hierarchical registration pipeline is then performed to register abdominal volumes to a high-resolution CT atlas template with DEEDS affine and non-rigid registration. To generate and evaluate the atlas framework, multi-contrast modality CT scans of 500 subjects (without reported history of renal disease, age: 15-50 years, 250 males & 250 females) were processed. PDD-Net with affine registration achieved the best overall mean DICE for portal venous phase multi-organ label transfer with the registration pipeline (0.540 ± 0.275, p < 0.0001 Wilcoxon signed-rank test) compared with the other registration tools. It also demonstrated the best performance with the median DICE over 0.8 in transferring the kidney information to the atlas space.
DEEDS performs consistently, with stable transfer performance in the average mapping of all phases, including clear kidney boundaries with contrastive characteristics, while PDD-Net only demonstrates a stable kidney registration in the average mapping of early and late arterial, and portal venous phase. The variance mappings demonstrate the low intensity variance in the kidney regions with DEEDS across all contrast phases and with PDD-Net across late arterial and portal venous phase. We demonstrate a stable generalizability of the atlas template for integrating the normal kidney variation from small to large, across contrast modalities and populations with great variability of demographics. The linkage of atlas and demographics provided a better understanding of the variation of kidney anatomy across populations.
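Illustrative note (not part of the published abstract): the DICE score used above to evaluate label transfer is the Dice similarity coefficient, 2|A ∩ B| / (|A| + |B|), between a transferred label mask and a reference mask. The sketch below assumes flattened binary masks; the function name, the toy masks, and the convention for two empty masks are assumptions made for illustration only.

```python
def dice(mask_a, mask_b):
    """Dice similarity coefficient between two binary masks (flat 0/1 sequences)."""
    if len(mask_a) != len(mask_b):
        raise ValueError("masks must have the same number of voxels")
    # Voxels labeled in both masks.
    overlap = sum(1 for a, b in zip(mask_a, mask_b) if a and b)
    # Total labeled voxels across the two masks.
    total = sum(1 for a in mask_a if a) + sum(1 for b in mask_b if b)
    if total == 0:
        return 1.0  # convention chosen here: two empty masks agree perfectly
    return 2.0 * overlap / total


# Hypothetical 1D "kidney" masks: atlas label vs. label transferred by registration.
atlas_label = [0, 1, 1, 1, 0, 0]
transferred = [0, 0, 1, 1, 1, 0]
print(round(dice(atlas_label, transferred), 3))
```

A score of 1 means the two masks coincide exactly and 0 means they do not overlap at all.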

2022-08-01

Unilateral Radiation Therapy for Tonsillar Cancer: Treatment Outcomes in the Era of Human Papillomavirus, Positron-Emission Tomography, and Intensity Modulated Radiation Therapy

Taku N, Chronowski G, Brandon Gunn G, Morrison WH, Gross ND, Moreno AC, Ferrarotto R, Frank SJ, Fuller CD, Goepfert RP, Phan J, Lai SY, Reddy JP, Rosenthal DI, Garden AS

HIVE IEC-PSC

Purpose: The goal of this study was to evaluate disease, survival, and toxic effects after unilateral radiation therapy treatment for tonsillar cancer. Methods and materials: A retrospective study was performed of patients treated at our institution within the period from 2000 to 2018. Summary statistics were used to assess the cohort by patient characteristics and treatments delivered. The Kaplan-Meier method was used to determine survival outcomes. Results: The cohort comprised 403 patients, including 343 (85%) with clinical and/or radiographic evidence of ipsilateral cervical nodal disease and 181 (45%) with multiple involved nodes. Human papillomavirus was detected in 294 (73%) tumors. Median follow-up time was 5.8 years. Disease relapse was infrequent with local recurrence in 9 (2%) patients, neck recurrence in 13 (3%) patients, and recurrence in the unirradiated contralateral neck in 9 (2%) patients. Five- and 10-year overall survival rates were 94% and 89%, respectively. Gastrostomy tubes were needed in 32 (9%) patients, and no patient had a feeding tube 6 months after therapy. Conclusions: For patients with well-lateralized tonsillar tumors and no clinically evident adenopathy of the contralateral neck, unilateral radiation therapy offers favorable rates of disease outcomes and a relatively low toxicity profile.

2022-08-10

Highly multiplexed, label-free proteoform imaging of tissues by individual ion mass spectrometry

Su P, McGee JP, Durbin KR, Hollas MAR, Yang M, Neumann EK, Allen JL, Drown BS, Butun FA, Greer JB, Early BP, Fellers RT, Spraggins JM, Laskin J, Camarillo JM, Kafader JO, Kelleher NL

Consortium

Imaging of proteoforms in human tissues is hindered by low molecular specificity and limited proteome coverage. Here, we introduce proteoform imaging mass spectrometry (PiMS), which increases the size limit for proteoform detection and identification by fourfold compared to reported methods and reveals tissue localization of proteoforms at <80-μm spatial resolution. PiMS advances proteoform imaging by combining ambient nanospray desorption electrospray ionization with ion detection using individual ion mass spectrometry. We demonstrate highly multiplexed proteoform imaging of human kidney, annotating 169 of 400 proteoforms of <70 kDa using top-down MS and a database lookup of ~1000 kidney candidate proteoforms, including dozens of key enzymes in primary metabolism. PiMS images reveal distinct spatial localizations of proteoforms to both anatomical structures and cellular neighborhoods in the vasculature, medulla, and cortex regions of the human kidney. The benefits of PiMS are poised to increase proteome coverage for label-free protein imaging of tissues.

2022-08-16

Characterizing cellular heterogeneity in chromatin state with scCUT&Tag-pro

Zhang B, Srivastava A, Mimitou E, Stuart T, Raimondi I, Hao Y, Smibert P, Satija R

HIVE MC-NYGC

Technologies that profile chromatin modifications at single-cell resolution offer enormous promise for functional genomic characterization, but the sparsity of the measurements and integrating multiple binding maps represent substantial challenges. Here we introduce single-cell (sc)CUT&Tag-pro, a multimodal assay for profiling protein-DNA interactions coupled with the abundance of surface proteins in single cells. In addition, we introduce single-cell ChromHMM, which integrates data from multiple experiments to infer and annotate chromatin states based on combinatorial histone modification patterns. We apply these tools to perform an integrated analysis across nine different molecular modalities in circulating human immune cells. We demonstrate how these two approaches can characterize dynamic changes in the function of individual genomic elements across both discrete cell states and continuous developmental trajectories, nominate associated motifs and regulators that establish chromatin states and identify extensive and cell-type-specific regulatory priming. Finally, we demonstrate how our integrated reference can serve as a scaffold to map and improve the interpretation of additional scCUT&Tag datasets.

2022-08-17

Spatial profiling of chromatin accessibility in mouse and human tissues

Deng Y, Bartosovic M, Ma S, Zhang D, Kukanja P, Xiao Y, Su G, Liu Y, Qin X, Rosoklija GB, Dwork AJ, Mann JJ, Xu ML, Halene S, Craft JE, Leong KW, Boldrini M, Castelo-Branco G, Fan R

TTD-Yale

Cellular function in tissue is dependent on the local environment, requiring new methods for spatial mapping of biomolecules and cells in the tissue context¹. The emergence of spatial transcriptomics has enabled genome-scale gene expression mapping²⁻⁵, but the ability to capture spatial epigenetic information of tissue at the cellular level and genome scale is lacking. Here we describe a method for spatially resolved chromatin accessibility profiling of tissue sections using next-generation sequencing (spatial-ATAC-seq) by combining in situ Tn5 transposition chemistry⁶ and microfluidic deterministic barcoding⁵. Profiling mouse embryos using spatial-ATAC-seq delineated tissue-region-specific epigenetic landscapes and identified gene regulators involved in the development of the central nervous system. Mapping the accessible genome in the mouse and human brain revealed the intricate arealization of brain regions. Applying spatial-ATAC-seq to tonsil tissue resolved the spatially distinct organization of immune cell types and states in lymphoid follicles and extrafollicular zones. This technology progresses spatial biology by enabling spatially resolved chromatin accessibility profiling to improve our understanding of cell identity, cell state and cell fate decision in relation to epigenetic underpinnings in development and disease.

2022-08-19

A user-friendly tool for cloud-based whole slide image segmentation with examples from renal histopathology

Lutnick B, Manthey D, Becker JU, Ginley B, Moos K, Zuckerman JE, Rodrigues L, Gallan AJ, Barisoni L, Alpers CE, Wang XX, Myakala K, Jones BA, Levi M, Kopp JB, Yoshida T, Zee J, Han SS, Jain S, Rosenberg AZ, Jen KY, Sarder P; Kidney Precision Medicine Project

TMC-UCSD

Background: Image-based machine learning tools hold great promise for clinical applications in pathology research. However, the ideal end-users of these computational tools (e.g., pathologists and biological scientists) often lack the programming experience required for the setup and use of these tools which often rely on the use of command line interfaces. Methods: We have developed Histo-Cloud, a tool for segmentation of whole slide images (WSIs) that has an easy-to-use graphical user interface. This tool runs a state-of-the-art convolutional neural network (CNN) for segmentation of WSIs in the cloud and allows the extraction of features from segmented regions for further analysis. Results: By segmenting glomeruli, interstitial fibrosis and tubular atrophy, and vascular structures from renal and non-renal WSIs, we demonstrate the scalability, best practices for transfer learning, and effects of dataset variability. Finally, we demonstrate an application for animal model research, analyzing glomerular features in three murine models. Conclusions: Histo-Cloud is open source, accessible over the internet, and adaptable for segmentation of any histological structure regardless of stain.

2022-09-05

Without appropriate metadata, data-sharing mandates are pointless

Musen MA

HIVE MC-IU

N/A

2022-09-08

Phenotypes of disease severity in a cohort of hospitalized COVID-19 patients: Results from the IMPACC study

Ozonoff A, Schaenman J, Jayavelu ND, Milliren CE, Calfee CS, Cairns CB, Kraft M, Baden LR, Shaw AC, Krammer F, van Bakel H, Esserman DA, Liu S, Sesma AF, Simon V, Hafler DA, Montgomery RR, Kleinstein SH, Levy O, Bime C, Haddad EK, Erle DJ, Pulendran B, Nadeau KC, Davis MM, Hough CL, Messer WB, Higuita NIA, Metcalf JP, Atkinson MA, Brakenridge SC, Corry D, Kheradmand F, Ehrlich LIR, Melamed E, McComsey GA, Sekaly R, Diray-Arce J, Peters B, Augustine AD, Reed EF, Altman MC, Becker PM, Rouphael N; IMPACC study group members

TMC-Florida

Background: Better understanding of the association between characteristics of patients hospitalized with coronavirus disease 2019 (COVID-19) and outcome is needed to further improve upon patient management. Methods: Immunophenotyping Assessment in a COVID-19 Cohort (IMPACC) is a prospective, observational study of 1164 patients from 20 hospitals across the United States. Disease severity was assessed using a 7-point ordinal scale based on degree of respiratory illness. Patients were prospectively surveyed for 1 year after discharge for post-acute sequelae of COVID-19 (PASC) through quarterly surveys. Demographics, comorbidities, radiographic findings, clinical laboratory values, SARS-CoV-2 PCR and serology were captured over a 28-day period. Multivariable logistic regression was performed. Findings: The median age was 59 years (interquartile range [IQR] 20); 711 (61%) were men; overall mortality was 14%, and 228 (20%) required invasive mechanical ventilation. Unsupervised clustering of ordinal score over time revealed distinct disease course trajectories. Risk factors associated with prolonged hospitalization or death by day 28 included age ≥ 65 years (odds ratio [OR], 2.01; 95% CI 1.28-3.17), Hispanic ethnicity (OR, 1.71; 95% CI 1.13-2.57), elevated baseline creatinine (OR 2.80; 95% CI 1.63-4.80) or troponin (OR 1.89; 95% CI 1.03-3.47), baseline lymphopenia (OR 2.19; 95% CI 1.61-2.97), presence of infiltrate by chest imaging (OR 3.16; 95% CI 1.96-5.10), and high SARS-CoV-2 viral load (OR 1.53; 95% CI 1.17-2.00). Fatal cases had the lowest ratio of SARS-CoV-2 antibody to viral load levels compared to other trajectories over time (p=0.001). 589 survivors (51%) completed at least one survey at follow-up with 305 (52%) having at least one symptom consistent with PASC, most commonly dyspnea (56% among symptomatic patients). Female sex was the only associated risk factor for PASC.
Interpretation: Integration of PCR cycle threshold and antibody values with demographics, comorbidities, and laboratory/radiographic findings identified risk factors for 28-day outcome severity, though only female sex was associated with PASC. Longitudinal clinical phenotyping offers important insights and provides a framework for immunophenotyping for acute and long COVID-19.

2022-09-12

CINS: Cell Interaction Network inference from Single cell expression data

Yuan Y, Cosme C Jr, Adams TS, Schupp J, Sakamoto K, Xylourgidis N, Ruffalo M, Li J, Kaminski N, Bar-Joseph Z

HIVE TC-CMU

Studies comparing single cell RNA-Seq (scRNA-Seq) data between conditions mainly focus on differences in the proportion of cell types or on differentially expressed genes. In many cases these differences are driven by changes in cell interactions which are challenging to infer without spatial information. To determine cell-cell interactions that differ between conditions we developed the Cell Interaction Network Inference (CINS) pipeline. CINS combines Bayesian network analysis with regression-based modeling to identify differential cell type interactions and the proteins that underlie them. We tested CINS on a disease case control and on an aging mouse dataset. In both cases CINS correctly identifies cell type interactions and the ligands involved in these interactions improving on prior methods suggested for cell interaction predictions. We performed additional mouse aging scRNA-Seq experiments which further support the interactions identified by CINS.

2022-09-15

Overcoming selection bias in synthetic lethality prediction

Seale C, Tepeli Y, Gonçalves JP

TMC-Vanderbilt (Eye/pancreas)

Motivation: Synthetic lethality (SL) between two genes occurs when simultaneous loss of function leads to cell death. This holds great promise for developing anti-cancer therapeutics that target synthetic lethal pairs of endogenously disrupted genes. Identifying novel SL relationships through exhaustive experimental screens is challenging, due to the vast number of candidate pairs. Computational SL prediction is therefore sought to identify promising SL gene pairs for further experimentation. However, current SL prediction methods lack consideration for generalizability in the presence of selection bias in SL data. Results: We show that SL data exhibit considerable gene selection bias. Our experiments designed to assess the robustness of SL prediction reveal that models driven by the topology of known SL interactions (e.g. graph, matrix factorization) are especially sensitive to selection bias. We introduce selection bias-resilient synthetic lethality (SBSL) prediction using regularized logistic regression or random forests. Each gene pair is described by 27 molecular features derived from cancer cell line, cancer patient tissue and healthy donor tissue samples. SBSL models are built and tested using approximately 8000 experimentally derived SL pairs across breast, colon, lung and ovarian cancers. Compared to other SL prediction methods, SBSL showed higher predictive performance, better generalizability and robustness to selection bias. Gene dependency, quantifying the essentiality of a gene for cell survival, contributed most to SBSL predictions. Random forests were superior to linear models in the absence of dependency features, highlighting the relevance of mutual exclusivity of somatic mutations, co-expression in healthy tissue and differential expression in tumour samples. Availability and implementation: https://github.com/joanagoncalveslab/sbsl. Supplementary information: Supplementary data are available at Bioinformatics online.

2022-09-20

Enhanced Spatial Mapping of Histone Proteoforms in Human Kidney Through MALDI-MSI by High-Field UHMR-Orbitrap Detection

Zemaitis KJ, Veličković D, Kew W, Fort KL, Reinhardt-Szyba M, Pamreddy A, Ding Y, Kaushik D, Sharma K, Makarov AA, Zhou M, Paša-Tolić L

TTD-PNNL

Core histones including H2A, H2B, H3, and H4 are key modulators of cellular repair, transcription, and replication within eukaryotic cells, playing vital roles in the pathogenesis of disease and cellular responses to environmental stimuli. Traditional mass spectrometry (MS)-based bottom-up and top-down proteomics allows for the comprehensive identification of proteins and of post-translational modification (PTM) harboring proteoforms. However, these methodologies have difficulties preserving near-cellular spatial distributions because they typically require laser capture microdissection (LCM) and advanced sample preparation techniques. Herein, we coupled a matrix-assisted laser desorption/ionization (MALDI) source with a Thermo Scientific Q Exactive HF Orbitrap MS upgraded with ultrahigh mass range (UHMR) boards for the first demonstration of complementary high-resolution accurate mass (HR/AM) measurements of proteoforms up to 16.5 kDa directly from tissues using this benchtop mass spectrometer. The platform achieved isotopic resolution throughout the detected mass range, providing confident assignments of proteoforms with low ppm mass error and a considerable increase in duty cycle over other Fourier transform mass analyzers. Proteoform mapping of core histones was demonstrated on sections of human kidney at near-cellular spatial resolution, with several key distributions of histone and other proteoforms noted within both healthy biopsy and a section from a renal cell carcinoma (RCC) containing nephrectomy. The use of MALDI-MS imaging (MSI) for proteoform mapping demonstrates several steps toward high-throughput accurate identification of proteoforms and provides a new tool for mapping biomolecule distributions throughout tissue sections in extended mass ranges.

2022-09-29

Concurrent inhibition of CDK2 adds to the anti-tumour activity of CDK4/6 inhibition in GIST

Schaefer IM, Hemming ML, Lundberg MZ, Serrata MP, Goldaracena I, Liu N, Yin P, Paulo JA, Gygi SP, George S, Morgan JA, Bertagnolli MM, Sicinska ET, Chu C, Zheng S, Mariño-Enríquez A, Hornick JL, Raut CP, Ou WB, Demetri GD, Saka SK, Fletcher JA

TTD-Harvard

Background: Advanced gastrointestinal stromal tumour (GIST) is characterised by genomic perturbations of key cell cycle regulators. Oncogenic activation of CDK4/6 results in RB1 inactivation and cell cycle progression. Given that single-agent CDK4/6 inhibitor therapy failed to show clinical activity in advanced GIST, we evaluated strategies for maximising response to therapeutic CDK4/6 inhibition. Methods: Targeted next-generation sequencing and multiplexed protein imaging were used to detect cell cycle regulator aberrations in GIST clinical samples. The impact of inhibitors of CDK2, CDK4 and CDK2/4/6 was determined through cell proliferation and protein detection assays. CDK-inhibitor resistance mechanisms were characterised in GIST cell lines after long-term exposure. Results: We identify recurrent genomic aberrations in cell cycle regulators causing co-activation of the CDK2 and CDK4/6 pathways in clinical GIST samples. Therapeutic co-targeting of CDK2 and CDK4/6 is synergistic in GIST cell lines with intact RB1, through inhibition of RB1 hyperphosphorylation and cell proliferation. Moreover, RB1 inactivation and a novel oncogenic cyclin D1 resulting from an intragenic rearrangement (CCND1::chr11.g:70025223) are mechanisms of acquired CDK-inhibitor resistance in GIST. Conclusions: These studies establish the biological rationale for CDK2 and CDK4/6 co-inhibition as a therapeutic strategy in patients with advanced GIST, including metastatic GIST progressing on tyrosine kinase inhibitors.

2022-10-04

Machine learning-assisted elucidation of CD81-CD44 interactions in promoting cancer stemness and extracellular vesicle integrity

Ramos EK, Tsai CF, Jia Y, Cao Y, Manu M, Taftaf R, Hoffmann AD, El-Shennawy L, Gritsenko MA, Adorno-Cruz V, Schuster EJ, Scholten D, Patel D, Liu X, Patel P, Wray B, Zhang Y, Zhang S, Moore RJ, Mathews JV, Schipma MJ, Liu T, Tokars VL, Cristofanilli M, Shi T, Shen Y, Dashzeveg NK, Liu H

TTD-PNNL/Northwestern

Tumor-initiating cells with reprogramming plasticity or stem-progenitor cell properties (stemness) are thought to be essential for cancer development and metastatic regeneration in many cancers; however, elucidation of the underlying molecular network and pathways remains demanding. Combining machine learning and experimental investigation, here we report CD81, a tetraspanin transmembrane protein known to be enriched in extracellular vesicles (EVs), as a newly identified driver of breast cancer stemness and metastasis. Using protein structure modeling and interface prediction-guided mutagenesis, we demonstrate that membrane CD81 interacts with CD44 through their extracellular regions in promoting tumor cell cluster formation and lung metastasis of triple negative breast cancer (TNBC) in human and mouse models. In-depth global and phosphoproteomic analyses of tumor cells deficient in CD81 or CD44 unveil endocytosis-related pathway alterations, leading to further identification of a quality-keeping role of CD44 and CD81 in EV secretion as well as in EV-associated stemness-promoting function. CD81 is coexpressed along with CD44 in human circulating tumor cells (CTCs) and enriched in clustered CTCs that promote cancer stemness and metastasis, supporting the clinical significance of CD81 in association with patient outcomes. Our study highlights machine learning as a powerful tool in facilitating the molecular understanding of new molecular targets in regulating stemness and metastasis of TNBC.

2022-10-06

A Parallelization Strategy for the Time Efficient Analysis of Thousands of LC/MS Runs in High-Performance Computing Environment

van Zalm P, Viodé A, Smolen K, Fatou B, Hayati AN, Schlaffner CN, Levy O, Steen J, Steen H

TMC-Florida

Combining robust proteomics instrumentation with high-throughput enabling liquid chromatography (LC) systems (e.g., timsTOF Pro and the Evosep One system, respectively) enabled mapping the proteomes of 1000s of samples. Fragpipe is one of the few computational protein identification and quantification frameworks that allows for the time-efficient analysis of such large data sets. However, it requires large amounts of computational power and data storage space that leave even state-of-the-art workstations underpowered when it comes to the analysis of proteomics data sets with 1000s of LC mass spectrometry runs. To address this issue, we developed and optimized a Fragpipe-based analysis strategy for a high-performance computing environment and analyzed 3348 plasma samples (6.4 TB) that were longitudinally collected from hospitalized COVID-19 patients under the auspices of the Immunophenotyping Assessment in a COVID-19 Cohort (IMPACC) study. Our parallelization strategy reduced the total runtime by ∼90% from 116 (theoretical) days to just 9 days in the high-performance computing environment. All code is open-source and can be deployed in any Simple Linux Utility for Resource Management (SLURM) high-performance computing environment, enabling the analysis of large-scale high-throughput proteomics studies.
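
The abstract above does not spell out how the runs were distributed, so the following is only a minimal sketch of the general idea: split independent LC/MS runs into batches that a scheduler could process concurrently, then check the reported runtime figures. The function names, the run identifiers, and the batch count of 100 are hypothetical and are not taken from the published code.

```python
import math

def chunk_runs(run_ids, n_batches):
    """Split run_ids into at most n_batches contiguous, nearly equal batches."""
    size = math.ceil(len(run_ids) / n_batches)
    return [run_ids[i:i + size] for i in range(0, len(run_ids), size)]

def runtime_reduction(serial_days, parallel_days):
    """Fraction of runtime saved relative to the serial estimate."""
    return 1.0 - parallel_days / serial_days

# Hypothetical run identifiers; only the count (3348) comes from the abstract.
runs = [f"run_{i:04d}" for i in range(3348)]
batches = chunk_runs(runs, 100)  # 100 is an arbitrary illustrative batch count

# Figures reported in the abstract: 116 theoretical days versus 9 days.
reduction = runtime_reduction(116, 9)
speedup = 116 / 9
print(f"{len(batches)} batches, reduction {reduction:.1%}, speedup {speedup:.1f}x")
```

Each batch could then be handed to one task of a scheduler job array. How the authors actually partition and merge results is described in their paper and code, not here.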

2022-10-21

Revving an Engine of Human Metabolism: Activity Enhancement of Triosephosphate Isomerase via Hemi-Phosphorylation

Schachner LF, Soye BD, Ro S, Kenney GE, Ives AN, Su T, Goo YA, Jewett MC, Rosenzweig AC, Kelleher NL

RTI-Northwestern

Triosephosphate isomerase (TPI) performs the 5th step in glycolysis, operates near the limit of diffusion, and is involved in "moonlighting" functions. Its dimer was found singly phosphorylated at Ser20 (pSer20) in human cells, with this post-translational modification (PTM) showing context-dependent stoichiometry and loss under oxidative stress. We generated synthetic pSer20 proteoforms using cell-free protein synthesis that showed enhanced TPI activity by 4-fold relative to unmodified TPI. Molecular dynamics simulations show that the phosphorylation enables a channel to form that shuttles substrate into the active site. Refolding, kinetic, and crystallographic analyses of point mutants including S20E/G/Q indicate that hetero-dimerization and subunit asymmetry are key features of TPI. Moreover, characterization of an endogenous human TPI tetramer also implicates tetramerization in enzymatic regulation. S20 is highly conserved across eukaryotic TPI, yet most prokaryotes contain E/D at this site, suggesting that phosphorylation of human TPI evolved a new switch to optionally boost an already fast enzyme. Overall, complete characterization of TPI shows how endogenous proteoform discovery can prioritize functional versus bystander PTMs.

2022-11-03

Real-time spatial registration for 3D human atlas

Chen L, Teng D, Zhu T, Kong J, Herr BW, Bueckle A, Börner K, Wang F

HIVE MC-IU

The human body is made up of about 37 trillion cells (adults). Each cell has its own unique role and is affected by its neighboring cells and environment. The NIH Human BioMolecular Atlas Program (HuBMAP) aims at developing a 3D atlas of the human body consisting of organs, vessels, tissues to single cells with all 3D spatially registered in a single 3D human atlas using tissues obtained from normal individuals across a wide range of ages. A critical step of building the atlas is to register 3D tissue blocks in real-time to the right location of a human organ, which itself consists of complex 3D sub-structures. The complexity of the 3D organ model, e.g., 35 meshes for a typical kidney, poses a significant computational challenge for the registration. In this paper, we propose a comprehensive framework TICKET (TIssue bloCK rEgisTration) to support tissue block registration for 3D human atlas, including (1) 3D mesh pre-processing, (2) spatial queries on intersection relationship and (3) intersection volume computation between organs and tissue blocks. To minimize search space and computation cost, we develop multi-level indexing at both the anatomical structure level and mesh level, and utilize OpenMP for parallel computing. Considering the cuboid-based shape of the tissue block, we propose an efficient voxelization-based method to estimate the intersection volume. Our experiments demonstrate that the proposed framework is efficient and practical. TICKET is being integrated into the HuBMAP CCF registration portal [1].
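
The voxelization idea mentioned above can be sketched as follows. This is an illustration under stated assumptions, not the TICKET implementation: the organ is replaced by a hypothetical point-membership test (a unit sphere stands in for a real mesh), and the function name and the grid resolution `n` are invented for the example.

```python
def estimate_intersection_volume(block_min, block_max, inside_organ, n=40):
    """Estimate the volume of (cuboid block ∩ organ) by sampling voxel centers.

    block_min, block_max: opposite corners (x, y, z) of an axis-aligned cuboid.
    inside_organ: hypothetical predicate f(x, y, z) -> bool for the organ.
    n: number of voxels along each axis of the block.
    """
    steps = [(hi - lo) / n for lo, hi in zip(block_min, block_max)]
    voxel_volume = steps[0] * steps[1] * steps[2]
    hits = 0
    for i in range(n):
        x = block_min[0] + (i + 0.5) * steps[0]
        for j in range(n):
            y = block_min[1] + (j + 0.5) * steps[1]
            for k in range(n):
                z = block_min[2] + (k + 0.5) * steps[2]
                if inside_organ(x, y, z):
                    hits += 1  # voxel center lies inside the organ
    return hits * voxel_volume

def unit_sphere(x, y, z):
    """Stand-in 'organ': the unit sphere centered at the origin."""
    return x * x + y * y + z * z <= 1.0

# A block that encloses the sphere: the estimate approaches 4/3 * pi.
enclosing = estimate_intersection_volume((-1, -1, -1), (1, 1, 1), unit_sphere)
# A block that lies fully inside the sphere: the estimate equals the block volume.
interior = estimate_intersection_volume((0, 0, 0), (0.5, 0.5, 0.5), unit_sphere)
```

Accuracy improves as `n` grows, at cubic cost in the number of membership tests. For a real mesh, the membership test itself is the expensive step.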

2022-11-04

Single-cell spatial proteomic imaging for human neuropathology

Vijayaragavan K, Cannon BJ, Tebaykin D, Bossé M, Baranski A, Oliveria JP, Bukhari SA, Mrdjen D, Corces MR, McCaffrey EF, Greenwald NF, Sigal Y, Marquez D, Khair Z, Bruce T, Goldston M, Bharadwaj A, Montine KS, Angelo RM, Montine TJ, Bendall SC

RTI-Stanford

Neurodegenerative disorders are characterized by phenotypic changes and hallmark proteopathies. Quantifying these in archival human brain tissues remains indispensable for validating animal models and understanding disease mechanisms. We present a framework for nanometer-scale, spatial proteomics with multiplex ion beam imaging (MIBI) for capturing neuropathological features. MIBI facilitated simultaneous, quantitative imaging of 36 proteins on archival human hippocampus from individuals spanning cognitively normal to dementia. Customized analysis strategies identified cell types and proteopathies in the hippocampus across stages of Alzheimer's disease (AD) neuropathologic change. We show microglia-pathologic tau interactions in hippocampal CA1 subfield in AD dementia. Data driven, sample independent creation of spatial proteomic regions identified persistent neurons in pathologic tau neighborhoods expressing mitochondrial protein MFN2, regardless of cognitive status, suggesting a survival advantage. Our study revealed unique insights from multiplexed imaging and data-driven approaches for neuropathologic analysis and serves broadly as a methodology for spatial proteomic analysis of archival human neuropathology. TEASER: Multiplex Ion beam Imaging enables deep spatial phenotyping of human neuropathology-associated cellular and disease features.

2022-11-11

Multiset multicover methods for discriminative marker selection

Hasanaj E, Alavi A, Gupta A, Póczos B, Bar-Joseph Z

HIVE TC-CMU

Markers are increasingly being used for several high-throughput data analysis and experimental design tasks. Examples include the use of markers for assigning cell types in scRNA-seq studies, for deconvolving bulk gene expression data, and for selecting marker proteins in single-cell spatial proteomics studies. Most marker selection methods focus on differential expression (DE) analysis. Although such methods work well for data with a few non-overlapping marker sets, they are not appropriate for large atlas-size datasets where several cell types and tissues are considered. To address this, we define the phenotype cover (PC) problem for marker selection and present algorithms that can improve the discriminative power of marker sets. Analysis of these sets on several marker-selection tasks suggests that these methods can lead to solutions that accurately distinguish different phenotypes in the data.
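
To make the covering view of marker selection concrete, here is a minimal greedy sketch of a plain set multicover, a simplification of the multiset multicover setting named in the title and not the authors' algorithm. Each "element" is a pair of phenotypes to be told apart, each marker covers the pairs it can discriminate, and every pair must be covered by a required number of markers. The gene names and coverage sets are hypothetical.

```python
def greedy_multicover(coverage, demand):
    """Greedily pick markers until every element is covered `demand[e]` times.

    coverage: dict marker -> set of elements the marker covers.
    demand:   dict element -> number of distinct markers required.
    Returns the chosen markers in selection order.
    """
    remaining = dict(demand)
    unused = set(coverage)
    chosen = []
    while any(r > 0 for r in remaining.values()):
        best, best_gain = None, 0
        for marker in sorted(unused):  # sorted for deterministic tie-breaking
            gain = sum(1 for e in coverage[marker] if remaining.get(e, 0) > 0)
            if gain > best_gain:
                best, best_gain = marker, gain
        if best is None:
            raise ValueError("demand cannot be met with the available markers")
        chosen.append(best)
        unused.remove(best)
        for e in coverage[best]:
            if remaining.get(e, 0) > 0:
                remaining[e] -= 1
    return chosen

# Hypothetical markers and the phenotype pairs each one separates.
coverage = {
    "geneA": {"T|B", "T|NK"},
    "geneB": {"T|B", "B|NK"},
    "geneC": {"T|NK", "B|NK"},
    "geneD": {"T|B"},
}
demand = {"T|B": 2, "T|NK": 2, "B|NK": 2}
selected = greedy_multicover(coverage, demand)
```

Unlike ranking genes by differential expression per cell type, the greedy step scores a marker by how many still-unsatisfied pairs it helps with, which is why redundant markers such as `geneD` are passed over.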

2022-11-12

Modeling community standards for metadata as templates makes data FAIR

Musen MA, O'Connor MJ, Schultes E, Martínez-Romero M, Hardi J, Graybeal J

HIVE IEC-PSC

It is challenging to determine whether datasets are findable, accessible, interoperable, and reusable (FAIR) because the FAIR Guiding Principles refer to highly idiosyncratic criteria regarding the metadata used to annotate datasets. Specifically, the FAIR principles require metadata to be “rich” and to adhere to “domain-relevant” community standards. Scientific communities should be able to define their own machine-actionable templates for metadata that encode these “rich,” discipline-specific elements. We have explored this template-based approach in the context of two software systems. One system is the CEDAR Workbench, which investigators use to author new metadata. The other is the FAIRware Workbench, which evaluates the metadata of archived datasets for their adherence to community standards. Benefits accrue when templates for metadata become central elements in an ecosystem of tools to manage online datasets—both because the templates serve as a community reference for what constitutes FAIR data, and because they embody that perspective in a form that can be distributed among a variety of software applications to assist with data stewardship and data sharing.

2022-11-14

Light-Seq: light-directed in situ barcoding of biomolecules in fixed cells and tissues for spatially indexed sequencing

Kishi JY, Liu N, West ER, Sheng K, Jordanides JJ, Serrata M, Cepko CL, Saka SK, Yin P

TTD-Harvard

We present Light-Seq, an approach for multiplexed spatial indexing of intact biological samples using light-directed DNA barcoding in fixed cells and tissues followed by ex situ sequencing. Light-Seq combines spatially targeted, rapid photocrosslinking of DNA barcodes onto complementary DNAs in situ with a one-step DNA stitching reaction to create pooled, spatially indexed sequencing libraries. This light-directed barcoding enables in situ selection of multiple cell populations in intact fixed tissue samples for full-transcriptome sequencing based on location, morphology or protein stains, without cellular dissociation. Applying Light-Seq to mouse retinal sections, we recovered thousands of differentially enriched transcripts from three cellular layers and discovered biomarkers for a very rare neuronal subtype, dopaminergic amacrine cells, from only four to eight individual cells per section. Light-Seq provides an accessible workflow to combine in situ imaging and protein staining with next generation sequencing of the same cells, leaving the sample intact for further analysis post-sequencing.

2022-11-15

GBZ file format for pangenome graphs

Sirén J, Paten B

HIVE TC-CMU

Motivation: Pangenome graphs representing aligned genome assemblies are being shared in the text-based Graphical Fragment Assembly format. As the number of assemblies grows, there is a need for a file format that can store the highly repetitive data space efficiently. Results: We propose the GBZ file format based on data structures used in the Giraffe short-read aligner. The format provides good compression, and the files can be efficiently loaded into in-memory data structures. We provide compression and decompression tools and libraries for using GBZ graphs, and we show that they can be efficiently used on a variety of systems. Availability and implementation: C++ and Rust implementations are available at https://github.com/jltsiren/gbwtgraph and https://github.com/jltsiren/gbwt-rs, respectively.

2022-11-15

Gestationally dependent immune organization at the maternal-fetal interface

Moore AR, Vivanco Gonzalez N, Plummer KA, Mitchel OR, Kaur H, Rivera M, Collica B, Goldston M, Filiz F, Angelo M, Palmer TD, Bendall SC

RTI-Stanford

The immune system and placenta have a dynamic relationship across gestation to accommodate fetal growth and development. High-resolution characterization of this maternal-fetal interface is necessary to better understand the immunology of pregnancy and its complications. We developed a single-cell framework to simultaneously immuno-phenotype circulating, endovascular, and tissue-resident cells at the maternal-fetal interface throughout gestation, discriminating maternal and fetal contributions. Our data reveal distinct immune profiles across the endovascular and tissue compartments with tractable dynamics throughout gestation that respond to a systemic immune challenge in a gestationally dependent manner. We uncover a significant role for the innate immune system where phagocytes and neutrophils drive temporal organization of the placenta through remarkably diverse populations, including PD-L1⁺ subsets having compartmental and early gestational bias. Our approach and accompanying datasets provide a resource for additional investigations into gestational immunology and evoke a more significant role for the innate immune system in establishing the microenvironment of early pregnancy.

2022-11-16

An optimized approach and inflation media for obtaining complimentary mass spectrometry-based omics data from human lung tissue

Lukowski JK, Olson H, Velickovic M, Wang J, Kyle JE, Kim YM, Williams SM, Zhu Y, Huyck HL, McGraw MD, Poole C, Rogers L, Misra R, Alexandrov T, Ansong C, Pryhuber GS, Clair G, Adkins JN, Carson JP, Anderton CR

TMC-URMC

Human disease states are biomolecularly multifaceted and can span across phenotypic states, therefore it is important to understand diseases on all levels, across cell types, and within and across microanatomical tissue compartments. To obtain an accurate and representative view of the molecular landscape within human lungs, this fragile tissue must be inflated and embedded to maintain spatial fidelity of the location of molecules and minimize molecular degradation for molecular imaging experiments. Here, we evaluated agarose inflation and carboxymethyl cellulose embedding media and determined effective tissue preparation protocols for performing bulk and spatial mass spectrometry-based omics measurements. Mass spectrometry imaging methods were optimized to boost the number of annotatable molecules in agarose inflated lung samples. This optimized protocol permitted the observation of unique lipid distributions within several airway regions in the lung tissue block. Laser capture microdissection of these airway regions followed by high-resolution proteomic analysis allowed us to begin linking the lipidome with the proteome in a spatially resolved manner, where we observed proteins with high abundance specifically localized to the airway regions. We also compared our mass spectrometry results to lung tissue samples preserved using two other inflation/embedding media, but we identified several pitfalls with the sample preparation steps using this preservation method. Overall, we demonstrated the versatility of the inflation method, and we can start to reveal how the metabolome, lipidome, and proteome are connected spatially in human lungs and across disease states through a variety of different experiments.

2022-11-30

Annotation of Spatially Resolved Single-cell Data with STELLAR

Brbić M, Cao K, Hickey JW, Tan Y, Snyder MP, Nolan GP, Leskovec J

TMC-Stanford

Accurate cell-type annotation from spatially resolved single cells is crucial to understand functional spatial biology that is the basis of tissue organization. However, current computational methods for annotating spatially resolved single-cell data are typically based on techniques established for dissociated single-cell technologies and thus do not take spatial organization into account. Here we present STELLAR, a geometric deep learning method for cell-type discovery and identification in spatially resolved single-cell datasets. STELLAR automatically assigns cells to cell types present in the annotated reference dataset and discovers novel cell types and cell states. STELLAR transfers annotations across different dissection regions, different tissues and different donors, and learns cell representations that capture higher-order tissue structures. We successfully applied STELLAR to CODEX multiplexed fluorescent microscopy data and multiplexed RNA imaging datasets. Within the Human BioMolecular Atlas Program, STELLAR has annotated 2.6 million spatially resolved single cells with dramatic time savings.

2022-11-30

UNIFAN: A Tool for Unsupervised Single-Cell Clustering and Annotation

Li D, Ding J, Bar-Joseph Z

HIVE TC-CMU

UNIFAN is an unsupervised cell type annotation tool for single-cell RNA sequencing data (scRNA-seq). Given single-cell expression data as input, UNIFAN outputs cell clusters as well as annotations for each cluster. The clustering process utilizes information on pathways and biological processes and these are also used to annotate the resulting clusters. In this software article, we focus on how to install UNIFAN and on the main steps involved in using UNIFAN for cell type annotations.

2022-12-06

Understanding islet dysfunction in type 2 diabetes through multidimensional pancreatic phenotyping: The Human Pancreas Analysis Program

Shapira SN, Naji A, Atkinson MA, Powers AC, Kaestner KH

TMC-Vanderbilt (Eye/pancreas)

In this perspective, we provide an overview of a recently established National Institute of Diabetes and Digestive and Kidney Diseases (NIDDK) initiative, the Human Pancreas Analysis Program for Type 2 Diabetes (HPAP-T2D). This program is designed to define the molecular pathogenesis of islet dysfunction by studying human pancreatic tissue samples from organ donors with T2D. HPAP-T2D generates detailed datasets of physiological, histological, transcriptomic, epigenomic, and genomic information. Importantly, all data collected, generated, and analyzed by HPAP-T2D are made immediately and freely available through a centralized database, PANC-DB, thus providing a comprehensive data resource for the diabetes research community.

2022-12-08

Machine-learning based investigation of prognostic indicators for oncological outcome of pancreatic ductal adenocarcinoma

Chang J, Liu Y, Saey SA, Chang KC, Shrader HR, Steckly KL, Rajput M, Sonka M, Chan CHF

HIVE IEC-PSC

Introduction: Pancreatic ductal adenocarcinoma (PDAC) is an aggressive malignancy with a poor prognosis. Surgical resection remains the only potential curative treatment option for early-stage resectable PDAC. Patients with locally advanced or micrometastatic disease should ideally undergo neoadjuvant therapy prior to surgical resection for an optimal treatment outcome. Computerized tomography (CT) scan is the most common imaging modality obtained prior to surgery. However, the ability of CT scans to assess the nodal status and resectability remains suboptimal and depends heavily on physician experience. Improved preoperative radiographic tumor staging with the prediction of postoperative margin and the lymph node status could have important implications in treatment sequencing. This paper proposes a novel machine learning predictive model, utilizing a three-dimensional convoluted neural network (3D-CNN), to reliably predict the presence of lymph node metastasis and the postoperative positive margin status based on preoperative CT scans. Methods: A total of 881 CT scans were obtained from 110 patients with PDAC. Patients and images were separated into training and validation groups for both lymph node and margin prediction studies. Per-scan analysis and per-patient analysis (utilizing majority voting method) were performed. Results: For a lymph node prediction 3D-CNN model, accuracy was 90% for per-patient analysis and 75% for per-scan analysis. For a postoperative margin prediction 3D-CNN model, accuracy was 81% for per-patient analysis and 76% for per-scan analysis. Discussion: This paper provides a proof of concept that utilizing radiomics and the 3D-CNN deep learning framework may be used preoperatively to improve the prediction of positive resection margins as well as the presence of lymph node metastatic disease. Further investigations should be performed with larger cohorts to increase the generalizability of this model; however, there is a great promise in the use of convoluted neural networks to assist clinicians with treatment selection for patients with PDAC.
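
The per-patient analysis mentioned above aggregates several per-scan predictions into one label by majority vote. A minimal sketch of that aggregation step is shown below. The function name, the data layout, and the tie-breaking behavior are assumptions for illustration, and the sample predictions are made up.

```python
from collections import Counter

def per_patient_prediction(scan_predictions):
    """Collapse per-scan labels into one label per patient by majority vote.

    scan_predictions: iterable of (patient_id, predicted_label) pairs.
    Ties go to the label seen first for that patient (an assumption;
    the abstract does not say how ties were handled).
    """
    by_patient = {}
    for patient_id, label in scan_predictions:
        by_patient.setdefault(patient_id, []).append(label)
    return {
        patient_id: Counter(labels).most_common(1)[0][0]
        for patient_id, labels in by_patient.items()
    }

# Made-up per-scan outputs: 1 = positive, 0 = negative.
scans = [("p1", 1), ("p1", 1), ("p1", 0), ("p2", 0), ("p2", 0), ("p2", 1)]
votes = per_patient_prediction(scans)
```

Here one scan per patient is misclassified relative to the patient-level vote, yet both patient-level labels follow the majority. This is one way per-patient accuracy can exceed per-scan accuracy.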

2022-12-09

Emerging Computational Methods in Mass Spectrometry Imaging

Hu H, Laskin J

TTD-Purdue

Mass spectrometry imaging (MSI) is a powerful analytical technique that generates maps of hundreds of molecules in biological samples with high sensitivity and molecular specificity. Advanced MSI platforms capable of high-spatial-resolution and high-throughput acquisition generate vast amounts of data, which necessitates the development of computational tools for MSI data analysis. In addition, computation-driven MSI experiments have recently emerged as enabling technologies for further improving the MSI capabilities with little or no hardware modification. This review provides a critical summary of computational methods and resources developed for MSI data analysis and interpretation along with computational approaches for improving throughput and molecular coverage in MSI experiments. This review is focused on the recently developed artificial intelligence methods and provides an outlook for a future paradigm shift in MSI with transformative computational methods.

2022-12-13

Tissue registration and exploration user interfaces in support of a human reference atlas

Börner K, Bueckle A, Herr BW 2nd, Cross LE, Quardokus EM, Record EG, Ju Y, Silverstein JC, Browne KM, Jain S, Wasserfall CH, Jorgensen ML, Spraggins JM, Patterson NH, Weber GM

Consortium

Seventeen international consortia are collaborating on a human reference atlas (HRA), a comprehensive, high-resolution, three-dimensional atlas of all the cells in the healthy human body. Laboratories around the world are collecting tissue specimens from donors varying in sex, age, ethnicity, and body mass index. However, harmonizing tissue data across 25 organs and more than 15 bulk and spatial single-cell assay types poses challenges. Here, we present software tools and user interfaces developed to spatially and semantically annotate ("register") and explore the tissue data and the evolving HRA. A key part of these tools is a common coordinate framework, providing standard terminologies and data structures for describing specimen, biological structure, and spatial data linked to existing ontologies. As of April 22, 2022, the "registration" user interface has been used to harmonize and publish data on 5,909 tissue blocks collected by the Human Biomolecular Atlas Program (HuBMAP), the Stimulating Peripheral Activity to Relieve Conditions program (SPARC), the Human Cell Atlas (HCA), the Kidney Precision Medicine Project (KPMP), and the Genotype Tissue Expression project (GTEx). Further, 5,856 tissue sections were derived from 506 HuBMAP tissue blocks. The second "exploration" user interface enables consortia to evaluate data quality, explore tissue data spatially within the context of the HRA, and guide data acquisition. A companion website is at https://cns-iu.github.io/HRA-supporting-information/ .

2023-01-06

MatrisomeDB 2.0: 2023 updates to the ECM-protein knowledge database

Shao X, Gomez CD, Kapoor N, Considine JM, Grams C, Gao YT, Naba A

DP-Illinois

The extracellular matrix (ECM) is a complex assembly of proteins that constitutes the scaffold organizing cells, tissues, and organs. Over the past decade, mass-spectrometry-based proteomics has become the method of choice to profile the composition of the ECM, or the matrisome, of tissues. To assist non-specialists with the reuse of ECM proteomic datasets, we released MatrisomeDB (https://matrisomedb.org) in 2020. Here, we report the expansion of the database to include 25 new curated studies on the ECM of 24 new tissues in addition to datasets on tissues previously included, more than doubling the size of the original database and achieving near-complete coverage of the in-silico predicted matrisome. We further enhanced data visualization by maps of peptides and post-translational-modifications detected onto domain-based representations and 3D structures of ECM proteins. We also referenced external resources to facilitate the design of targeted mass spectrometry assays. Last, we implemented an abstract-mining tool that generates an enrichment word cloud from abstracts of studies in which a queried protein is found with higher confidence and higher abundance relative to other studies in MatrisomeDB.

2023-01-14

Creating a common language for the subanatomy of the ovary

Tsui EL, O'Neill KE, LeDuc RD, Shikanov A, Gomez-Lobo V, Laronda MM

TMC-UPenn

2023-01-16

Haplotype-aware pantranscriptome analyses using spliced pangenome graphs

Sibbesen JA, Eizenga JM, Novak AM, Sirén J, Chang X, Garrison E, Paten B

HIVE TC-CMU

Pangenomics is emerging as a powerful computational paradigm in bioinformatics. This field uses population-level genome reference structures, typically consisting of a sequence graph, to mitigate reference bias and facilitate analyses that were challenging with previous reference-based methods. In this work, we extend these methods into transcriptomics to analyze sequencing data using the pantranscriptome: a population-level transcriptomic reference. Our toolchain, which consists of additions to the VG toolkit and a standalone tool, RPVG, can construct spliced pangenome graphs, map RNA sequencing data to these graphs, and perform haplotype-aware expression quantification of transcripts in a pantranscriptome. We show that this workflow improves accuracy over state-of-the-art RNA sequencing mapping methods, and that it can efficiently quantify haplotype-specific transcript expression without needing to characterize the haplotypes of a sample beforehand.

2023-01-18

A streamlined tandem tip-based workflow for sensitive nanoscale phosphoproteomics

Tsai CF, Wang YT, Hsu CC, Kitata RB, Chu RK, Velickovic M, Zhao R, Williams SM, Chrisler WB, Jorgensen ML, Moore RJ, Zhu Y, Rodland KD, Smith RD, Wasserfall CH, Shi T, Liu T

TTD-PNNL/Northwestern

Effective phosphoproteome of nanoscale sample analysis remains a daunting task, primarily due to significant sample loss associated with non-specific surface adsorption during enrichment of low stoichiometric phosphopeptide. We develop a tandem tip phosphoproteomics sample preparation method that is capable of sample cleanup and enrichment without additional sample transfer, and its integration with our recently developed SOP (Surfactant-assisted One-Pot sample preparation) and iBASIL (improved Boosting to Amplify Signal with Isobaric Labeling) approaches provides a streamlined workflow enabling sensitive, high-throughput nanoscale phosphoproteome measurements. This approach significantly reduces both sample loss and processing time, allowing the identification of >3000 (>9500) phosphopeptides from 1 (10) µg of cell lysate using the label-free method without a spectral library. It also enables precise quantification of ~600 phosphopeptides from 100 sorted cells (single-cell level input for the enriched phosphopeptides) and ~700 phosphopeptides from human spleen tissue voxels with a spatial resolution of 200 µm (equivalent to ~100 cells) in a high-throughput manner. The new workflow opens avenues for phosphoproteome profiling of mass-limited samples at the low nanogram level.

2023-01-31

Polyphony: an Interactive Transfer Learning Framework for Single-Cell Data Analysis

Cheng F, Keller MS, Qu H, Gehlenborg N, Wang Q

HIVE TC-Harvard

Reference-based cell-type annotation can significantly reduce time and effort in single-cell analysis by transferring labels from a previously-annotated dataset to a new dataset. However, label transfer by end-to-end computational methods is challenging due to the entanglement of technical (e.g., from different sequencing batches or techniques) and biological (e.g., from different cellular microenvironments) variations, only the first of which must be removed. To address this issue, we propose Polyphony, an interactive transfer learning (ITL) framework, to complement biologists' knowledge with advanced computational methods. Polyphony is motivated and guided by domain experts' needs for a controllable, interactive, and algorithm-assisted annotation process, identified through interviews with seven biologists. We introduce anchors, i.e., analogous cell populations across datasets, as a paradigm to explain the computational process and collect user feedback for model improvement. We further design a set of visualizations and interactions to empower users to add, delete, or modify anchors, resulting in refined cell type annotations. The effectiveness of this approach is demonstrated through quantitative experiments, two hypothetical use cases, and interviews with two biologists. The results show that our anchor-based ITL method takes advantage of both human and machine intelligence in annotating massive single-cell datasets.

2023-01-31

A Multicompartmental Diffusion Model for Improved Assessment of Whole-Body Diffusion-weighted Imaging Data and Evaluation of Prostate Cancer Bone Metastases

Conlin CC, Feng CH, Digma LA, Rodríguez-Soto AE, Kuperman JM, Rakow-Penner R, Karow DS, White NS, Seibert TM, Hahn ME, Dale AM

TMC-UCSD (Female reproduction)

Purpose: To develop a multicompartmental signal model for whole-body diffusion-weighted imaging (DWI) and apply it to study the diffusion properties of normal tissue and metastatic prostate cancer bone lesions in vivo. Materials and Methods: This prospective study (ClinicalTrials.gov: NCT03440554) included 139 men with prostate cancer (mean age, 70 years ± 9 [SD]). Multicompartmental models with two to four tissue compartments were fit to DWI data from whole-body scans to determine optimal compartmental diffusion coefficients. Bayesian information criterion (BIC) and model-fitting residuals were calculated to quantify model complexity and goodness of fit. Diffusion coefficients for the optimal model (having lowest BIC) were used to compute compartmental signal-contribution maps. The signal intensity ratio (SIR) of bone lesions to normal-appearing bone was measured on these signal-contribution maps and on conventional DWI scans and compared using paired t tests (α = .05). Two-sample t tests (α = .05) were used to compare compartmental signal fractions between lesions and normal-appearing bone. Results: Lowest BIC was observed from the four-compartment model, with optimal compartmental diffusion coefficients of 0, 1.1 × 10^-3, 2.8 × 10^-3, and >3.0 × 10^-2 mm²/sec. Fitting residuals from this model were significantly lower than from conventional apparent diffusion coefficient mapping (P < .001). Bone lesion SIR was significantly higher on signal-contribution maps of model compartments 1 and 2 than on conventional DWI scans (P < .008). The fraction of signal from compartments 2, 3, and 4 was also significantly different between metastatic bone lesions and normal-appearing bone tissue (P ≤ .02). Conclusion: The four-compartment model best described whole-body diffusion properties. Compartmental signal contributions from this model can be used to examine prostate cancer bone involvement.
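
The abstract does not give the model equations, so the sketch below rests on two labeled assumptions: that each compartment contributes a mono-exponential term, so the signal at b-value b is the sum of f_i · exp(−b · D_i), and that BIC is computed under Gaussian residuals as n · ln(RSS / n) + k · ln(n). The signal fractions and residual sums are invented for illustration. Only the diffusion coefficients come from the abstract, with the fourth set to its reported lower bound.

```python
import math

def multicompartment_signal(b, fractions, diffusivities):
    """Assumed model: sum over compartments of f_i * exp(-b * D_i).

    b in s/mm^2, diffusivities in mm^2/s.
    """
    return sum(f * math.exp(-b * d) for f, d in zip(fractions, diffusivities))

def bic_gaussian(rss, n_obs, n_params):
    """BIC under Gaussian residuals: n * ln(RSS / n) + k * ln(n)."""
    return n_obs * math.log(rss / n_obs) + n_params * math.log(n_obs)

# Diffusion coefficients reported for the four-compartment model (mm^2/s);
# the last value is only a lower bound in the abstract.
D = [0.0, 1.1e-3, 2.8e-3, 3.0e-2]
f = [0.4, 0.3, 0.2, 0.1]  # hypothetical signal fractions

s0 = multicompartment_signal(0, f, D)
s1000 = multicompartment_signal(1000, f, D)

# Hypothetical residual sums of squares: a better fit lowers BIC,
# while extra parameters at the same fit quality raise it.
bic_small = bic_gaussian(rss=2.0, n_obs=100, n_params=4)
bic_better_fit = bic_gaussian(rss=1.0, n_obs=100, n_params=4)
bic_more_params = bic_gaussian(rss=2.0, n_obs=100, n_params=8)
```

Selecting the model with the lowest BIC, as the study does, rewards goodness of fit while penalizing each added compartment.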

2023-02-01

193 nm Ultraviolet Photodissociation for the Characterization of Singly Charged Proteoforms Generated by MALDI

Zemaitis KJ, Zhou M, Kew W, Paša-Tolić L

TTD-PNNL

MALDI imaging allows for the near-cellular profiling of proteoforms directly from microbial, plant, and mammalian samples. Despite detecting hundreds of proteoforms, identification of unknowns with only intact mass information remains a distinct challenge, even with high mass resolving power and mass accuracy. To this end, many supplementary methods have been used to create experimental databases for accurate mass matching, including bulk or spatially resolved bottom-up and/or top-down proteomics. Herein, we describe the application of 193 nm ultraviolet photodissociation (UVPD) for fragmentation of quadrupole isolated singly charged ubiquitin (m/z 8565) by MALDI-UVPD on a UHMR HF Orbitrap. This platform permitted the high-resolution accurate mass measurement of not just terminal fragments but also large internal fragments. The outlined workflow demonstrates the feasibility of top-down analyses of isolated MALDI protein ions and the potential toward more comprehensive characterization of proteoforms in MALDI imaging applications.

2023-02-01

Long noncoding RNA LEENE promotes angiogenesis and ischemic recovery in diabetes models

Tang X, Luo Y, Yuan D, Calandrelli R, Malhi NK, Sriram K, Miao Y, Lou CH, Tsark W, Tapia A, Chen AT, Zhang G, Roeth D, Kalkum M, Wang ZV, Chien S, Natarajan R, Cooke JP, Zhong S, Chen ZB

TTD-UCSD/City of Hope

Impaired angiogenesis in diabetes is a key process contributing to ischemic diseases such as peripheral arterial disease. Epigenetic mechanisms, including those mediated by long noncoding RNAs (lncRNAs), are crucial links connecting diabetes and the related chronic tissue ischemia. Here we identify the lncRNA that enhances endothelial nitric oxide synthase (eNOS) expression (LEENE) as a regulator of angiogenesis and ischemic response. LEENE expression was decreased in diabetic conditions in cultured endothelial cells (ECs), mouse hind limb muscles, and human arteries. Inhibition of LEENE in human microvascular ECs reduced their angiogenic capacity with a dysregulated angiogenic gene program. Diabetic mice deficient in Leene demonstrated impaired angiogenesis and perfusion following hind limb ischemia. Importantly, overexpression of human LEENE rescued the impaired ischemic response in Leene-knockout mice at tissue functional and single-cell transcriptomic levels. Mechanistically, LEENE RNA promoted transcription of proangiogenic genes in ECs, such as KDR (encoding VEGFR2) and NOS3 (encoding eNOS), potentially by interacting with LEO1, a key component of the RNA polymerase II-associated factor complex, and MYC, a crucial transcription factor for angiogenesis. Taken together, our findings demonstrate an essential role for LEENE in the regulation of angiogenesis and tissue perfusion. Functional enhancement of LEENE to restore angiogenesis for tissue repair and regeneration may represent a potential strategy to tackle ischemic vascular diseases.

2023-02-03

Optimal gap-affine alignment in O(s) space

Marco-Sola S, Eizenga JM, Guarracino A, Paten B, Garrison E, Moreto M

HIVE TC-CMU

Motivation: Pairwise sequence alignment remains a fundamental problem in computational biology and bioinformatics. Recent advances in genomics and sequencing technologies demand faster and scalable algorithms that can cope with the ever-increasing sequence lengths. Classical pairwise alignment algorithms based on dynamic programming are strongly limited by quadratic requirements in time and memory. The recently proposed wavefront alignment algorithm (WFA) introduced an efficient algorithm to perform exact gap-affine alignment in O(ns) time, where s is the optimal score and n is the sequence length. Notwithstanding these bounds, WFA's O(s²) memory requirements become computationally impractical for genome-scale alignments, leading to a need for further improvement. Results: In this article, we present the bidirectional WFA algorithm, the first gap-affine algorithm capable of computing optimal alignments in O(s) memory while retaining WFA's time complexity of O(ns). As a result, this work improves the lowest known memory bound O(n) to compute gap-affine alignments. In practice, our implementation never requires more than a few hundred MBs aligning noisy Oxford Nanopore Technologies reads up to 1 Mbp long while maintaining competitive execution times. Availability and implementation: All code is publicly available at https://github.com/smarco/BiWFA-paper. Supplementary information: Supplementary data are available at Bioinformatics online.

2023-02-24

Deep Learning Approach for Dynamic Sampling for Multichannel Mass Spectrometry Imaging

Helminiak D, Hu H, Laskin J, Hye Ye D

TTD-Purdue

Mass Spectrometry Imaging (MSI), using traditional rectilinear scanning, takes hours to days for high spatial resolution acquisitions. Given that most pixels within a sample's field of view are often neither relevant to underlying biological structures nor chemically informative, MSI presents as a prime candidate for integration with sparse and dynamic sampling algorithms. During a scan, stochastic models determine which locations probabilistically contain information critical to the generation of low-error reconstructions. Decreasing the number of required physical measurements thereby minimizes overall acquisition times. A Deep Learning Approach for Dynamic Sampling (DLADS), utilizing a Convolutional Neural Network (CNN) and encapsulating molecular mass intensity distributions within a third dimension, demonstrates a simulated 70% throughput improvement for Nanospray Desorption Electrospray Ionization (nano-DESI) MSI tissues. Evaluations are conducted between DLADS, a Supervised Learning Approach for Dynamic Sampling, with Least-Squares regression (SLADS-LS), and a Multi-Layer Perceptron (MLP) network (SLADS-Net). When compared with SLADS-LS, limited to a single m/z channel, as well as multichannel SLADS-LS and SLADS-Net, DLADS respectively improves regression performance by 36.7%, 7.0%, and 6.2%, resulting in gains to reconstruction quality of 6.0%, 2.1%, and 3.4% for acquisition of targeted m/z.

2023-02-28

Spatially Resolved Top-Down Proteomics of Tissue Sections Based on a Microfluidic Nanodroplet Sample Preparation Platform

Liao YC, Fulcher JM, Degnan DJ, Williams SM, Bramer LM, Veličković D, Zemaitis KJ, Veličković M, Sontag RL, Moore RJ, Paša-Tolić L, Zhu Y, Zhou M

TTD-PNNL

Conventional proteomic approaches measure the averaged signal from mixed cell populations or bulk tissues, leading to the dilution of signals arising from subpopulations of cells that might serve as important biomarkers. Recent developments in bottom-up proteomics have enabled spatial mapping of cellular heterogeneity in tissue microenvironments. However, bottom-up proteomics cannot unambiguously define and quantify proteoforms, which are intact (i.e., functional) forms of proteins capturing genetic variations, alternatively spliced transcripts and posttranslational modifications. Herein, we described a spatially resolved top-down proteomics (TDP) platform for proteoform identification and quantitation directly from tissue sections. The spatial TDP platform consisted of a nanodroplet processing in one pot for trace samples-based sample preparation system and a laser capture microdissection-based cell isolation system. We improved the nanodroplet processing in one pot for trace samples sample preparation by adding benzonase in the extraction buffer to enhance the coverage of nuclear proteins. Using ∼200 cultured cells as test samples, this approach increased total proteoform identifications from 493 to 700, with newly identified proteoforms primarily corresponding to nuclear proteins. To demonstrate the spatial TDP platform in tissue samples, we analyzed laser capture microdissection-isolated tissue voxels from rat brain cortex and hypothalamus regions. We quantified 509 proteoforms within the union of top-down mass spectrometry-based proteoform identification and characterization and TDPortal identifications to match with features from protein mass extractor. Several proteoforms corresponding to the same gene exhibited mixed abundance profiles between two tissue regions, suggesting potential posttranslational modification-specific spatial distributions. The spatial TDP workflow has prospects for biomarker discovery at proteoform level from small tissue sections.

2023-02-28

Unsupervised Registration Refinement for Generating Unbiased Eye Atlas

Lee HH, Tang Y, Bao S, Yang Q, Xu X, Schey KL, Spraggins JM, Huo Y, Landman BA

TMC-Vanderbilt (Kidney)

With the confounding effects of demographics across large-scale imaging surveys, substantial variation is demonstrated with the volumetric structure of orbit and eye anthropometry. Such variability increases the level of difficulty to localize the anatomical features of the eye organs for populational analysis. To adapt to the variability of eye organs with stable registration transfer, we propose an unbiased eye atlas template followed by a hierarchical coarse-to-fine approach to provide generalized eye organ context across populations. Furthermore, we retrieved volumetric scans from 1842 healthy patients for generating an eye atlas template with minimal biases. Briefly, we select 20 subject scans and use an iterative approach to generate an initial unbiased template. We then perform metric-based registration to the remaining samples with the unbiased template and generate coarse registered outputs. The coarse registered outputs are further leveraged to train a deep probabilistic network, which aims to refine the organ deformation in an unsupervised setting. Computed tomography (CT) scans of 100 de-identified subjects are used to generate and evaluate the unbiased atlas template with the hierarchical pipeline. The refined registration shows the stable transfer of the eye organs, which were well-localized in the high-resolution (0.5 mm³) atlas space and demonstrated a significant improvement of 2.37% Dice for inverse label transfer performance. The subject-wise qualitative representations with surface rendering successfully demonstrate the transfer details of the organ context and showed the applicability of generalizing the morphological variation across patients.

2023-03-01

Mass spectrometry-based targeted proteomics for analysis of protein mutations

Lin TT, Zhang T, Kitata RB, Liu T, Smith RD, Qian WJ, Shi T

TTD-PNNL/Northwestern

Cancers are caused by accumulated DNA mutations. This recognition of the central role of mutations in cancer, together with recent advances in next-generation sequencing, has initiated the massive screening of clinical samples and the identification of 1000s of cancer-associated gene mutations. However, proteomic analysis of the expressed mutation products lags far behind genomic (transcriptomic) analysis. With comprehensive global proteomics analysis, only a small percentage of single nucleotide variants detected by DNA and RNA sequencing have been observed as single amino acid variants due to current technical limitations. Proteomic analysis of mutations is important with the potential to advance cancer biomarker development and the discovery of new therapeutic targets for more effective disease treatment. Targeted proteomics using selected reaction monitoring (also known as multiple reaction monitoring) and parallel reaction monitoring has emerged as a powerful tool with significant advantages over global proteomics for analysis of protein mutations in terms of detection sensitivity, quantitation accuracy and overall practicality (e.g., reliable identification and the scale of quantification). Herein we review recent advances in the targeted proteomics technology for enhancing detection sensitivity and multiplexing capability and highlight its broad biomedical applications for analysis of protein mutations in human bodily fluids, tissues, and cell lines. Furthermore, we review recent applications of top-down proteomics for analysis of protein mutations. Unlike the commonly used bottom-up proteomics, which requires digestion of proteins into peptides, top-down proteomics directly analyzes intact proteins for more precise characterization of mutation isoforms. Finally, general perspectives on the potential of achieving both high sensitivity and high sample throughput for large-scale targeted detection and quantification of important protein mutations are discussed.

2023-03-01

Human pancreatic capillaries and nerve fibers persist in type 1 diabetes despite beta cell loss

Richardson TM, Saunders DC, Haliyur R, Shrestha S, Cartailler JP, Reinert RB, Petronglo J, Bottino R, Aramandla R, Bradley AM, Jenkins R, Phillips S, Kang H; Human Pancreas Analysis Program; Caicedo A, Powers AC, Brissova M

TMC-Vanderbilt (Eye/pancreas)

The autonomic nervous system regulates pancreatic function. Islet capillaries are essential for the extension of axonal projections into islets, and both of these structures are important for appropriate islet hormone secretion. Because beta cells provide important paracrine cues for islet glucagon secretion and neurovascular development, we postulated that beta cell loss in type 1 diabetes (T1D) would lead to a decline in intra-islet capillaries and reduction of islet innervation, possibly contributing to abnormal glucagon secretion. To define morphological characteristics of capillaries and nerve fibers in islets and acinar tissue compartments, we analyzed neurovascular assembly across the largest cohort of T1D and normal individuals studied thus far. Because innervation has been studied extensively in rodent models of T1D, we also compared the neurovascular architecture between mouse and human pancreas and assembled transcriptomic profiles of molecules guiding islet angiogenesis and neuronal development. We found striking inter-species differences in islet neurovascular assembly but relatively modest differences at transcriptome level, suggesting post-transcriptional regulation may be involved in this process. To determine if islet neurovascular arrangement is altered following beta cell loss in T1D, we compared pancreatic tissues from non-diabetic, recent-onset T1D (<10 years duration), and longstanding T1D donors (>10 years duration). Both islets and acinar tissue had greater capillary density in recent-onset T1D accompanied by overall greater islet nerve fiber density in recent-onset and longstanding T1D as visualized by a pan-neuronal marker. We did not detect changes in sympathetic axons in either T1D cohort. Additionally, nerve fibers overlapped with extracellular matrix (ECM), supporting its role in the formation and function of axonal processes. These results indicate that pancreatic capillaries and nerve fibers persist in T1D despite beta cell loss, suggesting that alpha cell secretory changes may be decoupled from neurovascular components.

2023-03-01

Anatomic nomenclature and 3-dimensional regional model of the human ovary: call for a new paradigm

O'Neill KE, Maher JY, Laronda MM, Duncan FE, LeDuc RD, Lujan ME, Oktay KH, Pouch AM, Segars JH, Tsui EL, Zelinski MB, Halvorson LM, Gomez-Lobo V

TMC-UPenn

The ovaries are the female gonads that are crucial for reproduction, steroid production, and overall health. Historically, the ovary was broadly divided into regions defined as the cortex, medulla, and hilum. This current nomenclature lacks specificity and fails to consider the significant anatomic variations in the ovary. Recent technological advances in imaging modalities and high-resolution omic analyses have brought about the need for revision of the existing definitions, which will facilitate the integration of generated data and enable the characterization of organ subanatomy and function at the cellular level. The creation of these high-resolution multimodal maps of the ovary will enhance collaboration and communication among disciplines and between clinicians and researchers. Beginning in March 2021, the Pediatric and Adolescent Gynecology Program of the Eunice Kennedy Shriver National Institute of Child Health and Human Development invited subject-matter experts to participate in a series of workshops and meetings to standardize ovarian nomenclature and define the organ's features. The goal was to develop a spatially defined and semantically consistent terminology of the ovary to support collaborative, team science-based endeavors aimed at generating reference atlases of the human ovary. The group recommended a standardized, 3-dimensional description of the ovary and an ontological approach to the subanatomy of the ovary and definition of follicles. This new greater precision in nomenclature and mapping will better reflect the ovary's heterogeneous composition and function, support the standardization of tissue collection, facilitate functional analyses, and enable clinical and research collaborations. The conceptualization process and outcomes of the effort, which spanned the better part of 2021 and early 2022, are introduced in this article. The institute and the workshop participants encourage researchers and clinicians to adopt the new systems in their everyday work to advance the overarching goal of improving human reproductive health.

2023-03-27

Specimen, biological structure, and spatial ontologies in support of a Human Reference Atlas

Herr BW 2nd, Hardi J, Quardokus EM, Bueckle A, Chen L, Wang F, Caron AR, Osumi-Sutherland D, Musen MA, Börner K

HIVE MC-IU

The Human Reference Atlas (HRA) is defined as a comprehensive, three-dimensional (3D) atlas of all the cells in the healthy human body. It is compiled by an international team of experts who develop standard terminologies that they link to 3D reference objects, describing anatomical structures. The third HRA release (v1.2) covers spatial reference data and ontology annotations for 26 organs. Experts access the HRA annotations via spreadsheets and view reference object models in 3D editing tools. This paper introduces the Common Coordinate Framework (CCF) Ontology v2.0.1 that interlinks specimen, biological structure, and spatial data, together with the CCF API that makes the HRA programmatically accessible and interoperable with Linked Open Data (LOD). We detail how real-world user needs and experimental data guide CCF Ontology design and implementation, present CCF Ontology classes and properties together with exemplary usage, and report on validation methods. The CCF Ontology graph database and API are used in the HuBMAP portal, HRA Organ Gallery, and other applications that support data queries across multiple, heterogeneous sources.

2023-03-28

Nano-DESI Mass Spectrometry Imaging of Proteoforms in Biological Tissues with High Spatial Resolution

Yang M, Unsihuay D, Hu H, Nguele Meke F, Qu Z, Zhang ZY, Laskin J

TTD-Purdue

Mass spectrometry imaging (MSI) is a powerful tool for label-free mapping of the spatial distribution of proteins in biological tissues. We have previously demonstrated imaging of individual proteoforms in biological tissues using nanospray desorption electrospray ionization (nano-DESI), an ambient liquid extraction-based MSI technique. Nano-DESI MSI generates multiply charged protein ions, which is advantageous for their identification using top-down proteomics analysis. In this study, we demonstrate proteoform mapping in biological tissues with a spatial resolution down to 7 μm using nano-DESI MSI. A substantial decrease in protein signals observed in high-spatial-resolution MSI makes these experiments challenging. We have enhanced the sensitivity of nano-DESI MSI experiments by optimizing the design of the capillary-based probe and the thickness of the tissue section. In addition, we demonstrate that oversampling may be used to further improve spatial resolution at little or no expense to sensitivity. These developments represent a new step in MSI-based spatial proteomics, which complements targeted imaging modalities widely used for studying biological systems.

2023-03-31

Super-resolution SRS microscopy with A-PoD

Jang H, Li Y, Fung AA, Bagheri P, Hoang K, Skowronska-Krawczyk D, Chen X, Wu JY, Bintu B, Shi L

TMC-WUSTL

Stimulated Raman scattering (SRS) offers the ability to image metabolic dynamics with high signal-to-noise ratio. However, its spatial resolution is limited by the numerical aperture of the imaging objective and the scattering cross-section of molecules. To achieve super-resolved SRS imaging, we developed a deconvolution algorithm, adaptive moment estimation (Adam) optimization-based pointillism deconvolution (A-PoD) and demonstrated a spatial resolution of lower than 59 nm on the membrane of a single lipid droplet (LD). We applied A-PoD to spatially correlated multiphoton fluorescence imaging and deuterium oxide (D₂O)-probed SRS (DO-SRS) imaging from diverse samples to compare nanoscopic distributions of proteins and lipids in cells and subcellular organelles. We successfully differentiated newly synthesized lipids in LDs using A-PoD-coupled DO-SRS. The A-PoD-enhanced DO-SRS imaging method was also applied to reveal metabolic changes in brain samples from Drosophila on different diets. This new approach allows us to quantitatively measure the nanoscopic colocalization of biomolecules and metabolic dynamics in organelles.

2023-03-31

Senescent cell population with ZEB1 transcription factor as its main regulator promotes osteoarthritis in cartilage and meniscus

Swahn H, Li K, Duffy T, Olmer M, D'Lima DD, Mondala TS, Natarajan P, Head SR, Lotz MK

TMC-UConn/Scripps

Objectives: Single-cell level analysis of articular cartilage and meniscus tissues from human healthy and osteoarthritis (OA) knees. Methods: Single-cell RNA sequencing (scRNA-seq) analyses were performed on articular cartilage and meniscus tissues from healthy (n=6, n=7) and OA (n=6, n=6) knees. Expression of genes of interest was validated using immunohistochemistry and RNA-seq and function was analysed by gene overexpression and depletion. Results: scRNA-seq analyses of human knee articular cartilage (70 972 cells) and meniscus (78 017 cells) identified a pathogenic subset that is shared between both tissues. This cell population is expanded in OA and has strong OA and senescence gene signatures. Further, this subset has critical roles in extracellular matrix (ECM) and tenascin signalling and is the dominant sender of signals to all other cartilage and meniscus clusters and a receiver of TGFβ signalling. Fibroblast activating protein (FAP) is also a dysregulated gene in this cluster and promotes ECM degradation. Regulons that are controlled by transcription factor ZEB1 are shared between the pathogenic subset in articular cartilage and meniscus. In meniscus and cartilage cells, FAP and ZEB1 promote expression of genes that contribute to OA pathogenesis, including senescence. Conclusions: These single-cell studies identified a senescent pathogenic cell cluster that is present in cartilage and meniscus and has FAP and ZEB1 as main regulators which are novel and promising therapeutic targets for OA-associated pathways in both tissues.

2023-04-12

Foundation models for generalist medical artificial intelligence

Moor M, Banerjee O, Abad ZSH, Krumholz HM, Leskovec J, Topol EJ, Rajpurkar P

TMC-Stanford

The exceptionally rapid development of highly flexible, reusable artificial intelligence (AI) models is likely to usher in newfound capabilities in medicine. We propose a new paradigm for medical AI, which we refer to as generalist medical AI (GMAI). GMAI models will be capable of carrying out a diverse set of tasks using very little or no task-specific labelled data. Built through self-supervision on large, diverse datasets, GMAI will flexibly interpret different combinations of medical modalities, including data from imaging, electronic health records, laboratory results, genomics, graphs or medical text. Models will in turn produce expressive outputs such as free-text explanations, spoken recommendations or image annotations that demonstrate advanced medical reasoning abilities. Here we identify a set of high-impact potential applications for GMAI and lay out specific technical capabilities and training datasets necessary to enable them. We expect that GMAI-enabled applications will challenge current strategies for regulating and validating AI devices for medicine and will shift practices associated with the collection of large medical datasets.

2023-04-13

High-Throughput Mass Spectrometry Imaging of Biological Systems: Current Approaches and Future Directions

Jiang LX, Yang M, Wali SN, Laskin J

TTD-Purdue

In the past two decades, the power of mass spectrometry imaging (MSI) for the label-free spatial mapping of molecules in biological systems has been substantially enhanced by the development of approaches for imaging with high spatial resolution. With the increase in the spatial resolution, the experimental throughput has become a limiting factor for imaging of large samples with high spatial resolution and 3D imaging of tissues. Several experimental and computational approaches have been recently developed to enhance the throughput of MSI. In this critical review, we provide a succinct summary of the current approaches used to improve the throughput of MSI experiments. These approaches are focused on speeding up sampling, reducing the mass spectrometer acquisition time, and reducing the number of sampling locations. We discuss the rate-determining steps for different MSI methods and future directions in the development of high-throughput MSI techniques.

2023-04-19

Uncovering the spatial landscape of molecular interactions within the tumor microenvironment through latent spaces

Deshpande A, Loth M, Sidiropoulos DN, Zhang S, Yuan L, Bell ATF, Zhu Q, Ho WJ, Santa-Maria C, Gilkes DM, Williams SR, Uytingco CR, Chew J, Hartnett A, Bent ZW, Favorov AV, Popel AS, Yarchoan M, Kiemen A, Wu PH, Fujikura K, Wirtz D, Wood LD, Zheng L, Jaffee EM, Anders RA, Danilova L, Stein-O'Brien G, Kagohara LT, Fertig EJ

TMC-JHU

Recent advances in spatial transcriptomics (STs) enable gene expression measurements from a tissue sample while retaining its spatial context. This technology enables unprecedented in situ resolution of the regulatory pathways that underlie the heterogeneity in the tumor as well as the tumor microenvironment (TME). The direct characterization of cellular co-localization with spatial technologies facilitates quantification of the molecular changes resulting from direct cell-cell interaction, as it occurs in tumor-immune interactions. We present SpaceMarkers, a bioinformatics algorithm to infer molecular changes from cell-cell interactions from latent space analysis of ST data. We apply this approach to infer the molecular changes from tumor-immune interactions in Visium spatial transcriptomics data of metastasis, invasive and precursor lesions, and immunotherapy treatment. Further transfer learning in matched scRNA-seq data enabled further quantification of the specific cell types in which SpaceMarkers are enriched. Altogether, SpaceMarkers can identify the location and context-specific molecular interactions within the TME from ST data.

2023-04-24

Using single cell atlas data to reconstruct regulatory networks

Song Q, Ruffalo M, Bar-Joseph Z

HIVE TC-CMU

Inference of global gene regulatory networks from omics data is a long-term goal of systems biology. Most methods developed for inferring transcription factor (TF)-gene interactions either relied on a small dataset or used snapshot data, which is not suitable for inferring a process that is inherently temporal. Here, we developed a new computational method that combines neural networks and multi-task learning to predict RNA velocity rather than gene expression values. This allows our method to overcome many of the problems faced by prior methods, leading to a more accurate and more comprehensive set of identified regulatory interactions. Application of our method to atlas scale single cell data from 6 HuBMAP tissues led to several validated and novel predictions and greatly improved on prior methods proposed for this task.

2023-04-27

The HRA Organ Gallery Affords Immersive Superpowers for Building and Exploring the Human Reference Atlas with Virtual Reality

Bueckle A, Qing C, Luley S, Kumar Y, Pandey N, Börner K

HIVE MC-IU

The Human Reference Atlas (HRA, https://humanatlas.io) funded by the NIH Human Biomolecular Atlas Program (HuBMAP, https://commonfund.nih.gov/hubmap) and other projects engages 17 international consortia to create a spatial reference of the healthy adult human body at single-cell resolution. The specimen, biological structure, and spatial data that define the HRA are disparate in nature and benefit from a visually explicit method of data integration. Virtual reality (VR) offers unique means to enable users to explore complex data structures in a three-dimensional (3D) immersive environment. On a 2D desktop application, the 3D spatiality and real-world size of the 3D reference organs of the atlas are hard to understand. If viewed in VR, the spatiality of the organs and tissue blocks mapped to the HRA can be explored in their true size and in a way that goes beyond traditional 2D user interfaces. Added 2D and 3D visualizations can then provide data-rich context. In this paper, we present the HRA Organ Gallery, a VR application to explore the atlas in an integrated VR environment. Presently, the HRA Organ Gallery features 55 3D reference organs, 1,203 mapped tissue blocks from 292 demographically diverse donors and 15 providers that link to 6,000+ datasets; it also features prototype visualizations of cell type distributions and 3D protein structures. We outline our plans to support two biological use cases: on-ramping novice and expert users to HuBMAP data available via the Data Portal (https://portal.hubmapconsortium.org), and quality assurance/quality control (QA/QC) for HRA data providers. Code and onboarding materials are available at https://github.com/cns-iu/hra-organ-gallery-in-vr.

2023-04-28

Ten Years of Extracellular Matrix Proteomics: Accomplishments, Challenges, and Future Perspectives

Naba A

DP-Illinois

The extracellular matrix (ECM) is a complex assembly of hundreds of proteins forming the architectural scaffold of multicellular organisms. In addition to its structural role, the ECM conveys signals orchestrating cellular phenotypes. Alterations of ECM composition, abundance, structure, or mechanics have been linked to diseases and disorders affecting all physiological systems, including fibrosis and cancer. Deciphering the protein composition of the ECM and how it changes in pathophysiological contexts is thus the first step toward understanding the roles of the ECM in health and disease and toward the development of therapeutic strategies to correct disease-causing ECM alterations. Potentially, the ECM also represents a vast, yet untapped reservoir of disease biomarkers. ECM proteins are characterized by unique biochemical properties that have hindered their study: they are large, heavily and uniquely posttranslationally modified, and highly insoluble. Overcoming these challenges, we and others have devised mass-spectrometry–based proteomic approaches to define the ECM composition, or “matrisome,” of tissues. The first part of this review provides a historical overview of ECM proteomics research and presents the latest advances that now allow the profiling of the ECM of healthy and diseased tissues. The second part highlights recent examples illustrating how ECM proteomics has emerged as a powerful discovery pipeline to identify prognostic cancer biomarkers. The third part discusses remaining challenges limiting our ability to translate findings to clinical application and proposes approaches to overcome them. Lastly, the review introduces readers to resources available to facilitate the interpretation of ECM proteomics datasets. The ECM was once thought to be impenetrable. Mass spectrometry–based proteomics has proven to be a powerful tool to decode the ECM. In light of the progress made over the past decade, there are reasons to believe that the in-depth exploration of the matrisome is within reach and that we may soon witness the first translational application of ECM proteomics.

2023-04-28

Integrated Physiology of the Exocrine and Endocrine Compartments in Pancreatic Diseases: Workshop Proceedings

Mastracci TL, Apte M, Amundadottir LT, Alvarsson A, Artandi S, Bellin MD, Bernal-Mizrachi E, Caicedo A, Campbell-Thompson M, Cruz-Monserrate Z, El Ouaamari A, Gaulton KJ, Geisz A, Goodarzi MO, Hara M, Hull-Meichle RL, Kleger A, Klein AP, Kopp JL, Kulkarni RN, Muzumdar MD, Naren AP, Oakes SA, Olesen SS, Phelps EA, Powers AC, Stabler CL, Tirkes T, Whitcomb DC, Yadav D, Yong J, Zaghloul NA, Pandol SJ, Sander M

TMC-PNNL

The Integrated Physiology of the Exocrine and Endocrine Compartments in Pancreatic Diseases workshop was a 1.5-day scientific conference at the National Institutes of Health (Bethesda, MD) that engaged clinical and basic science investigators interested in diseases of the pancreas. This report provides a summary of the proceedings from the workshop. The goals of the workshop were to forge connections and identify gaps in knowledge that could guide future research directions. Presentations were segregated into six major theme areas, including 1) pancreas anatomy and physiology, 2) diabetes in the setting of exocrine disease, 3) metabolic influences on the exocrine pancreas, 4) genetic drivers of pancreatic diseases, 5) tools for integrated pancreatic analysis, and 6) implications of exocrine-endocrine cross talk. For each theme, multiple presentations were followed by panel discussions on specific topics relevant to each area of research; these are summarized here. Significantly, the discussions resulted in the identification of research gaps and opportunities for the field to address. In general, it was concluded that as a pancreas research community, we must more thoughtfully integrate our current knowledge of normal physiology as well as the disease mechanisms that underlie endocrine and exocrine disorders so that there is a better understanding of the interplay between these compartments.

2023-04-28

Spatial epigenome-transcriptome co-profiling of mammalian tissues

Zhang D, Deng Y, Kukanja P, Agirre E, Bartosovic M, Dong M, Ma C, Ma S, Su G, Bao S, Liu Y, Xiao Y, Rosoklija GB, Dwork AJ, Mann JJ, Leong KW, Boldrini M, Wang L, Haeussler M, Raphael BJ, Kluger Y, Castelo-Branco G, Fan R

TTD-Yale

Emerging spatial technologies, including spatial transcriptomics and spatial epigenomics, are becoming powerful tools for profiling of cellular states in the tissue context¹⁻⁵. However, current methods capture only one layer of omics information at a time, precluding the possibility of examining the mechanistic relationship across the central dogma of molecular biology. Here, we present two technologies for spatially resolved, genome-wide, joint profiling of the epigenome and transcriptome by cosequencing chromatin accessibility and gene expression, or histone modifications (H3K27me3, H3K27ac or H3K4me3) and gene expression on the same tissue section at near-single-cell resolution. These were applied to embryonic and juvenile mouse brain, as well as adult human brain, to map how epigenetic mechanisms control transcriptional phenotype and cell dynamics in tissue. Although highly concordant tissue features were identified by either spatial epigenome or spatial transcriptome, we also observed distinct patterns, suggesting their differential roles in defining cell states. Linking epigenome to transcriptome pixel by pixel allows the uncovering of new insights in spatial epigenetic priming, differentiation and gene regulation within the tissue architecture. These technologies are of great interest in life science and biomedical research.

2023-05-01

Conundrums of Choice of 'Normal' Kidney Tissue for Single Cell Studies

Jain S

TMC-WUSTL

Purpose of review: Defining molecular changes in key kidney cell types across lifespan and in disease states is essential to understand the pathogenetic basis of disease progression and targeted therapies. Various single cell approaches are being applied to define disease associated molecular signatures. Key considerations include the choice of reference tissue or 'normal' for comparison to diseased human specimens and a benchmark reference atlas. We provide an overview of select single cell technologies, key considerations for experimental design, quality control, choices and challenges associated with assay type and source for reference tissue.

2023-05-01

Computational Pathology Fusing Spatial Technologies

Border S, Lucarelli N, Eadon MT, El-Achkar TM, Jain S, Sarder P

TMC-WUSTL

2023-05-05

Rapid Multivariate Analysis Approach to Explore Differential Spatial Protein Profiles in Tissue

Sharman K, Patterson NH, Weiss A, Neumann EK, Guiberson ER, Ryan DJ, Gutierrez DB, Spraggins JM, Van de Plas R, Skaar EP, Caprioli RM

TMC-Vanderbilt (Kidney)

Spatially targeted proteomics analyzes the proteome of specific cell types and functional regions within tissue. While spatial context is often essential to understanding biological processes, interpreting sub-region-specific protein profiles can pose a challenge due to the high-dimensional nature of the data. Here, we develop a multivariate approach for rapid exploration of differential protein profiles acquired from distinct tissue regions and apply it to analyze a published spatially targeted proteomics data set collected from Staphylococcus aureus-infected murine kidney, 4 and 10 days postinfection. The data analysis process rapidly filters high-dimensional proteomic data to reveal relevant differentiating species among hundreds to thousands of measured molecules. We employ principal component analysis (PCA) for dimensionality reduction of protein profiles measured by microliquid extraction surface analysis mass spectrometry. Subsequently, k-means clustering of the PCA-processed data groups samples by chemical similarity. Cluster center interpretation revealed a subset of proteins that differentiate between spatial regions of infection over two time points. These proteins appear involved in tricarboxylic acid metabolomic pathways, calcium-dependent processes, and cytoskeletal organization. Gene ontology analysis further uncovered relationships to tissue damage/repair and calcium-related defense mechanisms. Applying our analysis in infectious disease highlighted differential proteomic changes across abscess regions over time, reflecting the dynamic nature of host-pathogen interactions.

2023-05-10

A draft human pangenome reference

Liao WW, Asri M, Ebler J, Doerr D, Haukness M, Hickey G, Lu S, Lucas JK, Monlong J, Abel HJ, Buonaiuto S, Chang XH, Cheng H, Chu J, Colonna V, Eizenga JM, Feng X, Fischer C, Fulton RS, Garg S, Groza C, Guarracino A, Harvey WT, Heumos S, Howe K, Jain M, Lu TY, Markello C, Martin FJ, Mitchell MW, Munson KM, Mwaniki MN, Novak AM, Olsen HE, Pesout T, Porubsky D, Prins P, Sibbesen JA, Sirén J, Tomlinson C, Villani F, Vollger MR, Antonacci-Fulton LL, Baid G, Baker CA, Belyaeva A, Billis K, Carroll A, Chang PC, Cody S, Cook DE, Cook-Deegan RM, Cornejo OE, Diekhans M, Ebert P, Fairley S, Fedrigo O, Felsenfeld AL, Formenti G, Frankish A, Gao Y, Garrison NA, Giron CG, Green RE, Haggerty L, Hoekzema K, Hourlier T, Ji HP, Kenny EE, Koenig BA, Kolesnikov A, Korbel JO, Kordosky J, Koren S, Lee H, Lewis AP, Magalhães H, Marco-Sola S, Marijon P, McCartney A, McDaniel J, Mountcastle J, Nattestad M, Nurk S, Olson ND, Popejoy AB, Puiu D, Rautiainen M, Regier AA, Rhie A, Sacco S, Sanders AD, Schneider VA, Schultz BI, Shafin K, S

HIVE TC-CMU

Here the Human Pangenome Reference Consortium presents a first draft of the human pangenome reference. The pangenome contains 47 phased, diploid assemblies from a cohort of genetically diverse individuals¹. These assemblies cover more than 99% of the expected sequence in each genome and are more than 99% accurate at the structural and base pair levels. Based on alignments of the assemblies, we generate a draft pangenome that captures known variants and haplotypes and reveals new alleles at structurally complex loci. We also add 119 million base pairs of euchromatic polymorphic sequences and 1,115 gene duplications relative to the existing reference GRCh38. Roughly 90 million of the additional base pairs are derived from structural variation. Using our draft pangenome to analyse short-read data reduced small variant discovery errors by 34% and increased the number of structural variants detected per haplotype by 104% compared with GRCh38-based workflows, which enabled the typing of the vast majority of structural variant alleles per sample.

2023-05-12

Bioorthogonal Chemical Imaging of Cell Metabolism Regulated by Aromatic Amino Acids

Bagheri P, Hoang K, Kuo CY, Trivedi H, Jang H, Shi L

TMC-WUSTL

Essential aromatic amino acids (AAAs) are building blocks for synthesizing new biomasses in cells and sustaining normal biological functions. For example, an abundant supply of AAAs is important for cancer cells to maintain their rapid growth and division. With this, there is a rising demand for a highly specific, noninvasive imaging approach with minimal sample preparation to directly visualize how cells harness AAAs for their metabolism in situ. Here, we develop an optical imaging platform that combines deuterium oxide (D2O) probing with stimulated Raman scattering (DO-SRS) and integrates DO-SRS with two-photon excitation fluorescence (2PEF) into a single microscope to directly visualize the metabolic activities of HeLa cells under AAA regulation. Collectively, the DO-SRS platform provides high spatial resolution and specificity of newly synthesized proteins and lipids in single HeLa cell units. In addition, the 2PEF modality can detect autofluorescence signals of nicotinamide adenine dinucleotide (NADH) and Flavin in a label-free manner. The imaging system described here is compatible with both in vitro and in vivo models, which is flexible for various experiments. The general workflow of this protocol includes cell culture, culture media preparation, cell synchronization, cell fixation, and sample imaging with DO-SRS and 2PEF modalities.

2023-05-15

Evaluation of cell segmentation methods without reference segmentations

Chen H, Murphy RF

HIVE TC-CMU

Cell segmentation is a cornerstone of many bioimage informatics studies, and inaccurate segmentation introduces error in downstream analysis. Evaluating segmentation results is thus a necessary step for developing segmentation methods as well as for choosing the most appropriate method for a particular type of sample. The evaluation process has typically involved comparison of segmentations with those generated by humans, which can be expensive and subject to unknown bias. We present here an approach to evaluating cell segmentation methods without relying upon comparison to results from humans. For this, we defined a number of segmentation quality metrics that can be applied to multichannel fluorescence images. We calculated these metrics for 14 previously described segmentation methods applied to datasets from four multiplexed microscope modalities covering five tissues. Using principal component analysis to combine the metrics, we defined an overall cell segmentation quality score and ranked the segmentation methods. We found that two deep learning-based methods performed the best overall, but that results for all methods could be significantly improved by postprocessing to ensure proper matching of cell and nuclear masks. Our evaluation tool is available as open source and all code and data are available in a Reproducible Research Archive.

2023-05-25

Dictionary learning for integrative, multimodal and scalable single-cell analysis

Hao Y, Stuart T, Kowalski MH, Choudhary S, Hoffman P, Hartman A, Srivastava A, Molla G, Madad S, Fernandez-Granda C, Satija R

HIVE MC-NYGC

Mapping single-cell sequencing profiles to comprehensive reference datasets provides a powerful alternative to unsupervised analysis. However, most reference datasets are constructed from single-cell RNA-sequencing data and cannot be used to annotate datasets that do not measure gene expression. Here we introduce 'bridge integration', a method to integrate single-cell datasets across modalities using a multiomic dataset as a molecular bridge. Each cell in the multiomic dataset constitutes an element in a 'dictionary', which is used to reconstruct unimodal datasets and transform them into a shared space. Our procedure accurately integrates transcriptomic data with independent single-cell measurements of chromatin accessibility, histone modifications, DNA methylation and protein levels. Moreover, we demonstrate how dictionary learning can be combined with sketching techniques to improve computational scalability and harmonize 8.6 million human immune cell profiles from sequencing and mass cytometry experiments. Our approach, implemented in version 5 of our Seurat toolkit (http://www.satijalab.org/seurat), broadens the utility of single-cell reference datasets and facilitates comparisons across diverse molecular modalities.

2023-05-26

Ex Vivo OCT-Based Multimodal Imaging of Human Donor Eyes for Research into Age-Related Macular Degeneration

Messinger JD, Brinkmann M, Kimble JA, Berlin A, Freund KB, Grossman GH, Ach T, Curcio CA

TMC-Vanderbilt (Eye/pancreas)

A progression sequence for age-related macular degeneration (AMD) learned from optical coherence tomography (OCT)-based multimodal (MMI) clinical imaging could add prognostic value to laboratory findings. In this work, ex vivo OCT and MMI were applied to human donor eyes prior to retinal tissue sectioning. The eyes were recovered from non-diabetic white donors aged ≥80 years, with a death-to-preservation time (DtoP) of ≤6 h. The globes were recovered on-site, scored with an 18 mm trephine to facilitate cornea removal, and immersed in buffered 4% paraformaldehyde. Color fundus images were acquired after anterior segment removal with a dissecting scope and an SLR camera using trans-, epi-, and flash illumination at three magnifications. The globes were placed in a buffer within a custom-designed chamber with a 60 diopter lens. They were imaged with spectral domain OCT (30° macula cube, 30 µm spacing, averaging = 25), near-infrared reflectance, 488 nm autofluorescence, and 787 nm autofluorescence. The AMD eyes showed a change in the retinal pigment epithelium (RPE), with drusen or subretinal drusenoid deposits (SDDs), with or without neovascularization, and without evidence of other causes. Between June 2016 and September 2017, 94 right eyes and 90 left eyes were recovered (DtoP: 3.9 ± 1.0 h). Of the 184 eyes, 40.2% had AMD, including early-intermediate (22.8%), atrophic (7.6%), and neovascular (9.8%) AMD, and 39.7% had unremarkable maculas. Drusen, SDDs, hyper-reflective foci, atrophy, and fibrovascular scars were identified using OCT. Artifacts included tissue opacification, detachments (bacillary, retinal, RPE, choroidal), foveal cystic change, an undulating RPE, and mechanical damage. To guide the cryo-sectioning, OCT volumes were used to find the fovea and optic nerve head landmarks and specific pathologies. The ex vivo volumes were registered with the in vivo volumes by selecting the reference function for eye tracking. The ex vivo visibility of the pathology seen in vivo depends on the preservation quality. Within 16 months, 75 rapid DtoP donor eyes at all stages of AMD were recovered and staged using clinical MMI methods.

2023-06-03

PodoCount: A Robust, Fully Automated, Whole-Slide Podocyte Quantification Tool

Santo BA, Govind D, Daneshpajouhnejad P, Yang X, Wang XX, Myakala K, Jones BA, Levi M, Kopp JB, Yoshida T, Niedernhofer LJ, Manthey D, Moon KC, Han SS, Zee J, Rosenberg AZ, Sarder P

TMC-UCSD

Introduction: Podocyte depletion is a histomorphologic indicator of glomerular injury and predicts clinical outcomes. Podocyte estimation methods or podometrics are semiquantitative, technically involved, and laborious. Implementation of high-throughput podometrics in experimental and clinical workflows necessitates an automated podometrics pipeline. Recognizing that computational image analysis offers a robust approach to study cell and tissue structure, we developed and validated PodoCount (a computational tool for automated podocyte quantification in immunohistochemically labeled tissues) using a diverse data set. Methods: Whole-slide images (WSIs) of tissues immunostained with a podocyte nuclear marker and periodic acid-Schiff counterstain were acquired. The data set consisted of murine whole kidney sections (n = 135) from 6 disease models and human kidney biopsy specimens from patients with diabetic nephropathy (DN) (n = 45). Within segmented glomeruli, podocytes were extracted and image analysis was applied to compute measures of podocyte depletion and nuclear morphometry. Computational performance evaluation and statistical testing were performed to validate podometric and associated image features. PodoCount was distributed as an open-source, cloud-based computational tool. Results: PodoCount produced highly accurate podocyte quantification when benchmarked against existing methods. Podocyte nuclear profiles were identified with 0.98 accuracy and segmented with 0.85 sensitivity and 0.99 specificity. Errors in podocyte count were bounded by 1 podocyte per glomerulus. Podocyte-specific image features were found to be significant predictors of disease state, proteinuria, and clinical outcome. Conclusion: PodoCount offers high-performance podocyte quantitation in diverse murine disease models and in human kidney biopsy specimens. Resultant features offer significant correlation with associated metadata and outcome. Our cloud-based tool will provide end users with a standardized approach for automated podometrics from gigapixel-sized WSIs.

2023-06-20

The expanding vistas of spatial transcriptomics

Tian L, Chen F, Macosko EZ

RTI-Broad

The formation and maintenance of tissue integrity requires complex, coordinated activities by thousands of genes and their encoded products. Until recently, transcript levels could only be quantified for a few genes in tissues, but advances in DNA sequencing, oligonucleotide synthesis and fluorescence microscopy have enabled the invention of a suite of spatial transcriptomics technologies capable of measuring the expression of many, or all, genes in situ. These technologies have evolved rapidly in sensitivity, multiplexing and throughput. As such, they have enabled the determination of the cell-type architecture of tissues, the querying of cell-cell interactions and the monitoring of molecular interactions between tissue components. The rapidly evolving spatial genomics landscape will enable generalized high-throughput genomic measurements and perturbations to be performed in the context of tissues. These advances will empower hypothesis generation and biological discovery and bridge the worlds of tissue biology and genomics.

2023-06-20

Systems biology approaches to unravel lymphocyte subsets and function

Kim Y, Greenleaf WJ, Bendall SC

TMC-BIDMC

Single-cell technologies have revealed the extensive heterogeneity and complexity of the immune system. Systems biology approaches in immunology have taken advantage of the high-parameter, high-throughput data and analyzed immune cell types in a 'bottom-up' data-driven method. This approach has discovered previously unrecognized cell types and functions. Especially for human immunology, in which experimental manipulations are challenging, the systems approach has become a successful means to investigate physiologically relevant contexts. This review focuses on the recent findings in lymphocyte biology, from their development, differentiation into subsets, and heterogeneity in their functions, enabled by these systems approaches. Furthermore, we review examples of the application of findings from systems approach studies and discuss how not to leave the rich dataset in the curse of high dimensionality.

2023-06-20

Prospective validation of diffusion-weighted MRI as a biomarker of tumor response and oncologic outcomes in head and neck cancer: Results from an observational biomarker pre-qualification study

Joint Head and Neck Radiotherapy-MRI Development Cooperative; Mohamed ASR, Abusaif A, He R, Wahid KA, Salama V, Youssef S, McDonald BA, Naser M, Ding Y, Salzillo TC, AboBakr MA, Wang J, Lai SY, Fuller CD

HIVE IEC-PSC

Purpose: To determine DWI parameters associated with tumor response and oncologic outcomes in head and neck cancer (HNC) patients treated with radiotherapy (RT). Methods: HNC patients in a prospective study were included. Patients had MRIs pre-, mid-, and post-RT completion. We used T2-weighted sequences for tumor segmentation which were co-registered to respective DWIs for extraction of apparent diffusion coefficient (ADC) measurements. Treatment response was assessed at mid- and post-RT and was defined as: complete response (CR) vs. non-complete response (non-CR). The Mann-Whitney U test was used to compare ADC between CR and non-CR. Recursive partitioning analysis (RPA) was performed to identify the ADC threshold associated with relapse. Cox proportional hazards models were done for clinical vs. clinical and imaging parameters and internal validation was done using a bootstrapping technique. Results: Eighty-one patients were included. Median follow-up was 31 months. For patients with post-RT CR, there was a significant increase in mean ADC at mid-RT compared to baseline ((1.8 ± 0.29) × 10⁻³ mm²/s vs. (1.37 ± 0.22) × 10⁻³ mm²/s, p < 0.0001), while patients with non-CR had no significant increase (p > 0.05). RPA identified GTV-P delta (Δ)ADC_mean < 7% at mid-RT as the most significant parameter associated with worse LC and RFS (p = 0.01). Uni- and multi-variable analysis showed that GTV-P ΔADC_mean at mid-RT ≥ 7% was significantly associated with better LC and RFS. The addition of ΔADC_mean significantly improved the c-indices of LC and RFS models compared with standard clinical variables (0.85 vs. 0.77 and 0.74 vs. 0.68 for LC and RFS, respectively, p < 0.0001 for both). Conclusion: ΔADC_mean at mid-RT is a strong predictor of oncologic outcomes in HNC. Patients with no significant increase of primary tumor ADC at mid-RT are at high risk of disease relapse.

2023-06-23

Multi-omic longitudinal study reveals immune correlates of clinical course among hospitalized COVID-19 patients

Diray-Arce J, Fourati S, Doni Jayavelu N, Patel R, Maguire C, Chang AC, Dandekar R, Qi J, Lee BH, van Zalm P, Schroeder A, Chen E, Konstorum A, Brito A, Gygi JP, Kho A, Chen J, Pawar S, Gonzalez-Reiche AS, Hoch A, Milliren CE, Overton JA, Westendorf K, IMPACC Network; Cairns CB, Rouphael N, Bosinger SE, Kim-Schulze S, Krammer F, Rosen L, Grubaugh ND, van Bakel H, Wilson M, Rajan J, Steen H, Eckalbar W, Cotsapas C, Langelier CR, Levy O, Altman MC, Maecker H, Montgomery RR, Haddad EK, Sekaly RP, Esserman D, Ozonoff A, Becker PM, Augustine AD, Guan L, Peters B, Kleinstein SH

TMC-Florida

The IMPACC cohort, composed of >1,000 hospitalized COVID-19 participants, contains five illness trajectory groups (TGs) during acute infection (first 28 days), ranging from milder (TG1-3) to more severe disease course (TG4) and death (TG5). Here, we report deep immunophenotyping, profiling of >15,000 longitudinal blood and nasal samples from 540 participants of the IMPACC cohort, using 14 distinct assays. These unbiased analyses identify cellular and molecular signatures present within 72 h of hospital admission that distinguish moderate from severe and fatal COVID-19 disease. Importantly, cellular and molecular states also distinguish participants with more severe disease that recover or stabilize within 28 days from those that progress to fatal outcomes (TG4 vs. TG5). Furthermore, our longitudinal design reveals that these biologic states display distinct temporal patterns associated with clinical outcomes. Characterizing host immune responses in relation to heterogeneity in disease course may inform clinical prognosis and opportunities for intervention.

2023-06-27

Microtechnologies for single-cell and spatial multi-omics

Deng Y, Bai Z, Fan R

TTD-Yale

Single-cell omics assays allow the identification of the type, subtype and functional state of a single cell. To put such single-cell data in the context of tissues, spatially resolved omics can be applied to quantify gene expression and regulation in intact tissues at the genome scale. However, to obtain a full picture of gene regulatory networks in a cell, multi-omic assays are required that can assess two or more modalities of omics information. In this Review, we discuss microfabricated systems that can be engineered to isolate, probe, manipulate and process single cells at the micrometre scale for single-cell and spatial multi-omics studies. We outline microchannel-, microarray- and droplet-based microfluidic platforms, examining their application in multimodal cellular profiling at the cellular and subcellular level. Finally, we discuss the key challenges that need to be addressed to advance the translation and commercialization of such microchip-based technologies for fundamental research and medical applications.

2023-06-28

Unsupervised cell functional annotation for single-cell RNA-seq

Li D, Ding J, Bar-Joseph Z

HIVE TC-CMU

One of the first steps in the analysis of single-cell RNA sequencing (scRNA-seq) data is the assignment of cell types. Although a number of supervised methods have been developed for this, in most cases such assignment is performed by first clustering cells in low-dimensional space and then assigning cell types to different clusters. To overcome noise and to improve cell type assignments, we developed UNIFAN, a neural network method that simultaneously clusters and annotates cells using known gene sets. UNIFAN combines both low-dimensional representation for all genes and cell-specific gene set activity scores to determine the clustering. We applied UNIFAN to human and mouse scRNA-seq data sets from several different organs. We show, by using knowledge about gene sets, that UNIFAN greatly outperforms prior methods developed for clustering scRNA-seq data. The gene sets assigned by UNIFAN to different clusters provide strong evidence for the cell type that is represented by this cluster, making annotations easier.

2023-06-29

Quantifying radiation in the axillary bed at the site of lymphedema surgical prevention

Friedman R, Spiegel DY, Kinney J, Willcox J, Recht A, Singhal D

TMC-BIDMC

Purpose: Immediate lymphatic reconstruction (ILR) is a procedure known to reduce the risk of lymphedema in patients undergoing axillary lymph node dissection (ALND). However, patients who receive adjuvant radiotherapy are at increased risk of lymphedema. The aim of this study was to quantify the extent of radiation at the site of surgical prevention. Methods: We recently began deploying clips at the site of ILR to identify the site during radiation planning. A retrospective review was performed to identify breast cancer patients who underwent ILR with clip deployment and adjuvant radiation therapy from October 2020 to April 2022. Patients were excluded if they had not completed radiotherapy. The exposure and dose of radiation received by the site was determined and recorded. Results: In a cohort of 11 patients, the site fell within the radiation field in 7 patients (64%) and received a median dose of 4280 cGy. Among these 7 patients, 3 had sites located within tissue considered at risk of oncologic recurrence and the remaining 4 sites received radiation from a tangential field treating the breast or chest wall. The median dose to the ILR site for the 4 patients whose sites were outside the radiation fields was 233 cGy. Conclusion: Our findings suggest that even when the site of surgical prevention was not within the targeted radiation field during treatment planning, it remains susceptible to radiation. Strategies for limiting radiation at this site are needed.

2023-06-29

Polygenic prediction of preeclampsia and gestational hypertension

Honigberg MC, Truong B, Khan RR, Xiao B, Bhatta L, Vy HMT, Guerrero RF, Schuermans A, Selvaraj MS, Patel AP, Koyama S, Cho SMJ, Vellarikkal SK, Trinder M, Urbut SM, Gray KJ, Brumpton BM, Patil S, Zöllner S, Antopia MC, Saxena R, Nadkarni GN, Do R, Yan Q, Pe'er I, Verma SS, Gupta RM, Haas DM, Martin HC, van Heel DA, Laisk T, Natarajan P

DP-Harvard

Preeclampsia and gestational hypertension are common pregnancy complications associated with adverse maternal and child outcomes. Current tools for prediction, prevention and treatment are limited. Here we tested the association of maternal DNA sequence variants with preeclampsia in 20,064 cases and 703,117 control individuals and with gestational hypertension in 11,027 cases and 412,788 control individuals across discovery and follow-up cohorts using multi-ancestry meta-analysis. Altogether, we identified 18 independent loci associated with preeclampsia/eclampsia and/or gestational hypertension, 12 of which are new (for example, MTHFR-CLCN6, WNT3A, NPR3, PGR and RGL3), including two loci (PLCE1 and FURIN) identified in the multitrait analysis. Identified loci highlight the role of natriuretic peptide signaling, angiogenesis, renal glomerular function, trophoblast development and immune dysregulation. We derived genome-wide polygenic risk scores that predicted preeclampsia/eclampsia and gestational hypertension in external cohorts, independent of clinical risk factors, and reclassified eligibility for low-dose aspirin to prevent preeclampsia. Collectively, these findings provide mechanistic insights into the hypertensive disorders of pregnancy and have the potential to advance pregnancy risk stratification.

2023-06-30

Deriving spatial features from in situ proteomics imaging to enhance cancer survival analysis

Dayao MT, Trevino A, Kim H, Ruffalo M, D'Angio HB, Preska R, Duvvuri U, Mayer AT, Bar-Joseph Z

HIVE TC-CMU

Motivation: Spatial proteomics data have been used to map cell states and improve our understanding of tissue organization. More recently, these methods have been extended to study the impact of such organization on disease progression and patient survival. However, to date, the majority of supervised learning methods utilizing these data types did not take full advantage of the spatial information, impacting their performance and utilization. Results: Taking inspiration from ecology and epidemiology, we developed novel spatial feature extraction methods for use with spatial proteomics data. We used these features to learn prediction models for cancer patient survival. As we show, using the spatial features led to consistent improvement over prior methods that used the spatial proteomics data for the same task. In addition, feature importance analysis revealed new insights about the cell interactions that contribute to patient survival. Availability and implementation: The code for this work can be found at gitlab.com/enable-medicine-public/spatsurv.

2023-07-05

Tissue Mass Spectrometry: How Solid Is Our Future?

Unsihuay D, Phipps WS, Paulovich AG, Chapman JR, Ducret A, Eberlin LS, Spraggins JM, Goodwin RJA

TMC-Vanderbilt (Kidney)

2023-07-10

SCS: cell segmentation for high-resolution spatial transcriptomics

Chen H, Li D, Bar-Joseph Z

HIVE TC-CMU

Spatial transcriptomics promises to greatly improve our understanding of tissue organization and cell-cell interactions. While most current platforms for spatial transcriptomics only offer multi-cellular resolution, with 10-15 cells per spot, recent technologies provide a much denser spot placement leading to subcellular resolution. A key challenge for these newer methods is cell segmentation and the assignment of spots to cells. Traditional image-based segmentation methods are limited and do not make full use of the information profiled by spatial transcriptomics. Here we present subcellular spatial transcriptomics cell segmentation (SCS), which combines imaging data with sequencing data to improve cell segmentation accuracy. SCS assigns spots to cells by adaptively learning the position of each spot relative to the center of its cell using a transformer neural network. SCS was tested on two new subcellular spatial transcriptomics technologies and outperformed traditional image-based segmentation methods. SCS achieved better accuracy, identified more cells and provided more realistic cell size estimation. Subcellular analysis of RNAs using SCS spot assignments provides information on RNA localization and further supports the segmentation results.

2023-07-19

Organization of the Human Intestine at Single Cell Resolution

Hickey JW, Becker WR, Nevins SA, Horning A, Perez AE, Zhu C, Zhu B, Wei B, Chiu R, Chen DC, Cotter DL, Esplin ED, Weimer AK, Caraccio C, Venkataraaman V, Schürch CM, Black S, Brbić M, Cao K, Chen S, Zhang W, Monte E, Zhang NR, Ma Z, Leskovec J, Zhang Z, Lin S, Longacre T, Plevritis SK, Lin Y, Nolan GP, Greenleaf WJ, Snyder M

TMC-Stanford

The intestine is a complex organ that promotes digestion, extracts nutrients, participates in immune surveillance, maintains critical symbiotic relationships with microbiota and affects overall health¹. The intestine has a length of over nine metres, along which there are differences in structure and function². The localization of individual cell types, cell type development trajectories and detailed cell transcriptional programs probably drive these differences in function. Here, to better understand these differences, we evaluated the organization of single cells using multiplexed imaging and single-nucleus RNA and open chromatin assays across eight different intestinal sites from nine donors. Through systematic analyses, we find cell compositions that differ substantially across regions of the intestine and demonstrate the complexity of epithelial subtypes, and find that the same cell types are organized into distinct neighbourhoods and communities, highlighting distinct immunological niches that are present in the intestine. We also map gene regulatory differences in these cells that are suggestive of a regulatory differentiation cascade, and associate intestinal disease heritability with specific cell types. These results describe the complexity of the cell composition, regulation and organization for this organ, and serve as an important reference map for understanding human biology and disease.

2023-07-19

Segmentation of human functional tissue units in support of a Human Reference Atlas

Jain Y, Godwin LL, Ju Y, Sood N, Quardokus EM, Bueckle A, Longacre T, Horning A, Lin Y, Esplin ED, Hickey JW, Snyder MP, Patterson NH, Spraggins JM, Börner K

HIVE MC-IU

The Human BioMolecular Atlas Program (HuBMAP) aims to compile a Human Reference Atlas (HRA) for the healthy adult body at the cellular level. Functional tissue units (FTUs), relevant for HRA construction, are of pathobiological significance. Manual segmentation of FTUs does not scale; highly accurate and performant, open-source machine-learning algorithms are needed. We designed and hosted a Kaggle competition that focused on development of such algorithms and 1200 teams from 60 countries participated. We present the competition outcomes and an expanded analysis of the winning algorithms on additional kidney and colon tissue data, and conduct a pilot study to understand spatial location and density of FTUs across the kidney. The top algorithm from the competition, Tom, outperforms other algorithms in the expanded study, while using fewer computational resources. Tom was added to the HuBMAP infrastructure to run kidney FTU segmentation at scale, showcasing the value of Kaggle competitions for advancing research.

2023-07-19

3D reconstruction of skin and spatial mapping of immune cell density, vascular distance and effects of sun exposure and aging

Ghose S, Ju Y, McDonough E, Ho J, Karunamurthy A, Chadwick C, Cho S, Rose R, Corwin A, Surrette C, Martinez J, Williams E, Sood A, Al-Kofahi Y, Falo LD Jr, Börner K, Ginty F

TMC-GE Global

Mapping the human body at single cell resolution in three dimensions (3D) is important for understanding cellular interactions in the context of tissue and organ organization. 2D spatial cell analysis in a single tissue section may be limited by cell numbers and histology. Here we show a workflow for 3D reconstruction of multiplexed sequential tissue sections: MATRICS-A (Multiplexed Image Three-D Reconstruction and Integrated Cell Spatial - Analysis). We demonstrate MATRICS-A in 26 serial sections of fixed skin (stained with 18 biomarkers) from 12 donors aged 32-72 years. Comparing the 3D reconstructed cellular data with the 2D data, we show significantly shorter distances between immune cells and vascular endothelial cells (56 µm in 3D vs 108 µm in 2D). We also show 10-70% more T cells (total) within 30 µm of a neighboring T helper cell in 3D vs 2D. Distances of p53, DDB2 and Ki67 positive cells to the skin surface were consistent across all ages/sun exposure and largely localized to the lower stratum basale layer of the epidermis. MATRICS-A provides a framework for analysis of 3D spatial cell relationships in healthy and aging organs and could be further extended to diseased organs.

2023-07-19

Segmenting functional tissue units across human organs using community-driven development of generalizable machine learning algorithms

Börner K

HIVE MC-IU

2023-07-19

Anatomical structures, cell types, and biomarkers of the healthy human blood vasculature

Boppana A, Lee S, Malhotra R, Halushka M, Gustilo KS, Quardokus EM, Herr BW 2nd, Börner K, Weber GM

HIVE MC-IU

More than 150 scientists from 17 consortia are collaborating on an international project to build a Human Reference Atlas, which maps all 37 trillion cells in the healthy adult human body. The initial release of this atlas provided hierarchical lists of the anatomical structures, cell types, and biomarkers in 11 organs. Here, we describe the methods we used as part of this initiative to build the first open, computer-readable, and comprehensive database of the adult human blood vasculature, called the Human Reference Atlas-Vasculature Common Coordinate Framework (HRA-VCCF). It includes 993 vessels and their branching connections, 10 cell types, and 10 biomarkers. With this paper we are releasing additional details on vessel types and subtypes, branching sequence, anastomoses, portal systems, microvasculature, functional tissue units, mappings to regions vessels supply or drain, geometric properties of vessels, and links to 3D reference objects. Future versions will add variants and connections to the lymph vasculature, and will iteratively expand and improve the database as additional experimental data become available through the participating consortia.

2023-07-19

A spatially resolved timeline of the human maternal-fetal interface

Greenbaum S, Averbukh I, Soon E, Rizzuto G, Baranski A, Greenwald NF, Kagel A, Bosse M, Jaswa EG, Khair Z, Kwok S, Warshawsky S, Piyadasa H, Goldston M, Spence A, Miller G, Schwartz M, Graf W, Van Valen D, Winn VD, Hollmann T, Keren L, van de Rijn M, Angelo M

TMC-Stanford (Bone marrow)

Beginning in the first trimester, fetally derived extravillous trophoblasts (EVTs) invade the uterus and remodel its spiral arteries, transforming them into large, dilated blood vessels. Several mechanisms have been proposed to explain how EVTs coordinate with the maternal decidua to promote a tissue microenvironment conducive to spiral artery remodelling (SAR)¹⁻³. However, it remains a matter of debate regarding which immune and stromal cells participate in these interactions and how this evolves with respect to gestational age. Here we used a multiomics approach, combining the strengths of spatial proteomics and transcriptomics, to construct a spatiotemporal atlas of the human maternal-fetal interface in the first half of pregnancy. We used multiplexed ion beam imaging by time-of-flight and a 37-plex antibody panel to analyse around 500,000 cells and 588 arteries within intact decidua from 66 individuals between 6 and 20 weeks of gestation, integrating this dataset with co-registered transcriptomics profiles. Gestational age substantially influenced the frequency of maternal immune and stromal cells, with tolerogenic subsets expressing CD206, CD163, TIM-3, galectin-9 and IDO-1 becoming increasingly enriched and colocalized at later time points. By contrast, SAR progression preferentially correlated with EVT invasion and was transcriptionally defined by 78 gene ontology pathways exhibiting distinct monotonic and biphasic trends. Last, we developed an integrated model of SAR whereby invasion is accompanied by the upregulation of pro-angiogenic, immunoregulatory EVT programmes that promote interactions with the vascular endothelium while avoiding the activation of maternal immune cells.

2023-07-19

Organ Mapping Antibody Panels (OMAPs): A community resource for standardized multiplexed tissue imaging

Quardokus EM, Saunders DC, McDonough E, Hickey JW, Werlein C, Surrette C, Rajbhandari P, Casals AM, Tian H, Lowery L, Neumann EK, Björklund F, Neelakantan TV, Croteau J, Wiblin AE, Fisher J, Livengood AJ, Dowell KG, Silverstein JC, Spraggins JM, Pryhuber GS, Deutsch G, Ginty F, Nolan GP, Melov S, Jonigk D, Caldwell MA, Vlachos IS, Muller W, Gehlenborg N, Stockwell BR, Lundberg E, Snyder MP, Germain RN, Camarillo JM, Kelleher NL, Börner K, Radtke AJ

Consortium

Multiplexed antibody-based imaging enables the detailed characterization of molecular and cellular organization in tissues. Advances in the field now allow high-parameter data collection (>60 targets); however, considerable expertise and capital are needed to construct the antibody panels employed by these methods. Organ mapping antibody panels are community-validated resources that save time and money, increase reproducibility, accelerate discovery and support the construction of a Human Reference Atlas.

2023-07-19

An atlas of healthy and injured cell states and niches in the human kidney

Lake BB, Menon R, Winfree S, Hu Q, Ferreira RM, Kalhor K, Barwinska D, Otto EA, Ferkowicz M, Diep D, Plongthongkum N, Knoten A, Urata S, Mariani LH, Naik AS, Eddy S, Zhang B, Wu Y, Salamon D, Williams JC, Wang X, Balderrama KS, Hoover PJ, Murray E, Marshall JL, Noel T, Vijayan A, Hartman A, Chen F, Waikar SS, Rosas SE, Wilson FP, Palevsky PM, Kiryluk K, Sedor JR, Toto RD, Parikh CR, Kim EH, Satija R, Greka A, Macosko EZ, Kharchenko PV, Gaut JP, Hodgin JB; KPMP Consortium; Eadon MT, Dagher PC, El-Achkar TM, Zhang K, Kretzler M, Jain S

Consortium

Understanding kidney disease relies on defining the complexity of cell types and states, their associated molecular profiles and interactions within tissue neighbourhoods¹. Here we applied multiple single-cell and single-nucleus assays (>400,000 nuclei or cells) and spatial imaging technologies to a broad spectrum of healthy reference kidneys (45 donors) and diseased kidneys (48 patients). This has provided a high-resolution cellular atlas of 51 main cell types, which include rare and previously undescribed cell populations. The multi-omic approach provides detailed transcriptomic profiles, regulatory factors and spatial localizations spanning the entire kidney. We also define 28 cellular states across nephron segments and interstitium that were altered in kidney injury, encompassing cycling, adaptive (successful or maladaptive repair), transitioning and degenerative states. Molecular signatures permitted the localization of these states within injury neighbourhoods using spatial transcriptomics, while large-scale 3D imaging analysis (around 1.2 million neighbourhoods) provided corresponding linkages to active immune responses. These analyses defined biological pathways that are relevant to injury time-course and niches, including signatures underlying epithelial repair that predicted maladaptive states associated with a decline in kidney function. This integrated multimodal spatial cell atlas of healthy and diseased human kidneys represents a comprehensive benchmark of cellular states, neighbourhoods, outcome-associated signatures and publicly available interactive visualizations.

2023-07-19

A spatially anchored transcriptomic atlas of the human kidney papilla identifies significant immune injury in patients with stone disease

Canela VH, Bowen WS, Ferreira RM, Syed F, Lingeman JE, Sabo AR, Barwinska D, Winfree S, Lake BB, Cheng YH, Gaut JP, Ferkowicz M, LaFavers KA, Zhang K, Coe FL, Worcester E; Kidney Precision Medicine Project; Jain S, Eadon MT, Williams JC Jr, El-Achkar TM

Consortium

Kidney stone disease causes significant morbidity and increases health care utilization. In this work, we decipher the cellular and molecular niche of the human renal papilla in patients with calcium oxalate (CaOx) stone disease and healthy subjects. In addition to identifying cell types important in papillary physiology, we characterize collecting duct cell subtypes and an undifferentiated epithelial cell type that was more prevalent in stone patients. Despite the focal nature of mineral deposition in nephrolithiasis, we uncover a global injury signature characterized by immune activation, oxidative stress and extracellular matrix remodeling. We also identify the association of MMP7 and MMP9 expression with stone disease and mineral deposition, respectively. MMP7 and MMP9 are significantly increased in the urine of patients with CaOx stone disease, and their levels correlate with disease activity. Our results define the spatial molecular landscape and specific pathways contributing to stone-mediated injury in the human papilla and identify associated urinary biomarkers.

2023-07-19

Advances and Prospects for the Human BioMolecular Atlas Program (HuBMAP)

Jain S, Pei L, Spraggins JM, Angelo M, Carson JP, Gehlenborg N, Ginty F, Gonçalves JP, Hagood JS, Hickey JW, Kelleher NL, Laurent LC, Lin S, Lin Y, Liu H, Naba A, Nakayasu ES, Qian WJ, Radtke A, Robson P, Stockwell BR, Van de Plas R, Vlachos IS, Zhou M; HuBMAP Consortium; Börner K, Snyder MP

Consortium

The Human BioMolecular Atlas Program (HuBMAP) aims to create a multi-scale spatial atlas of the healthy human body at single-cell resolution by applying advanced technologies and disseminating resources to the community. As the HuBMAP moves past its first phase, creating ontologies, protocols and pipelines, this Perspective introduces the production phase: the generation of reference spatial maps of functional tissue units across many organs from diverse populations and the creation of mapping tools and infrastructure to advance biomedical research.

2023-07-19

Multimodal mass spectrometry imaging reveals molecular, cellular and structural organization of mammalian liver at single-cell resolution

PREPRINT

TTD-Columbia/Penn State

2023-07-22

Non-Linear Lymphatic Anatomy in Breast Cancer Patients Prior to Axillary Lymph Node Dissection: A Risk Factor For Lymphedema Development

Kinney JR, Friedman R, Kim E, Tillotson E, Shillue K, Lee BT, Singhal D

TMC-BIDMC

Immediate lymphatic reconstruction (ILR) at the time of axillary lymph node dissection (ALND) has become increasingly utilized for the prevention of breast cancer related lymphedema. Preoperative indocyanine green (ICG) lymphography is routinely performed prior to an ILR procedure to characterize baseline lymphatic anatomy of the upper extremity. While most patients have linear lymphatic channels visualized on ICG, representing a non-diseased state, some patients demonstrate non-linear patterns. This study aims to determine potential inciting factors that help explain why some patients have non-linear patterns, and what these patterns represent regarding the relative risk of developing postoperative breast cancer related lymphedema in this population. A retrospective review was conducted to identify breast cancer patients who underwent successful ILR with preoperative ICG at our institution from November 2017-June 2022. Among the 248 patients who were identified, 13 (5%) had preoperative non-linear lymphatic anatomy. A history of trauma or surgery of the affected limb and an increasing number of sentinel lymph nodes removed prior to ALND appeared to be risk factors for non-linear lymphatic anatomy. Furthermore, non-linear anatomy in the limb of interest was associated with an increased risk of postoperative lymphedema development. Overall, non-linear lymphatic anatomy on pre-operative ICG lymphography appears to be a risk factor for developing ipsilateral breast cancer-related lymphedema. Guided by the study's findings, when breast cancer patients present with baseline non-linear lymphatic anatomy, our institution has implemented a protocol of prophylactically prescribing compression sleeves immediately following ALND.

2023-07-26

Edematous Dermal Thickening on Magnetic Resonance Imaging as a Biomarker for Lymphatic Surgical Outcomes

Kinney JR, Babapour S, Kim E, Friedman R, Singhal D, Lee BT, Tsai LL

TMC-BIDMC

Background and Objectives: One of the surgical treatments for breast cancer-related lymphedema (BCRL) is debulking lipectomy. The aim of this study is to investigate whether dermal thickness could be utilized as an objective indicator of post-operative changes following debulking. Materials and Methods: A retrospective review of BCRL patients who underwent debulking lipectomy was conducted. MRI-based dermal thickness was measured by two separate trained readers at 16 regions of the upper extremity. Pre- and post-operative reduction in dermal thickness was compared across the affected and unaffected (control) arms for each patient. The Wilcoxon rank sum test was used to assess for significant change. Univariate linear regression was used to assess the relationship between dermal thickness reduction and changes to LYMPH-Q scores, L-Dex scores, and relative volume change. Results: Seventeen patients were included in our analysis. There was significant reduction in dermal thickness at 5/16 regions in the affected arm. Dermal thickness change was significantly correlated with LYMPH-Q scores, L-Dex scores, and relative volume change in 2/16 limb compartments. There was predominant dermal thickening in the dorsal compartment of the upper arm and in the ventral and ulnar compartments of the forearm. Conclusions: Dermal thickness shows promising utility in tracking post-operative debulking procedures for breast cancer-related lymphedema. Further studies with larger patient populations and a variety of imaging modalities are required to continue to develop a clinically objective and reproducible method of post-surgical lymphedema staging and monitoring.

2023-07-31

KRAS(G12D) drives lepidic adenocarcinoma through stem-cell reprogramming

Juul NH, Yoon JK, Martinez MC, Rishi N, Kazadaeva YI, Morri M, Neff NF, Trope WL, Shrager JB, Sinha R, Desai TJ

TTD-Stanford

Many cancers originate from stem or progenitor cells hijacked by somatic mutations that drive replication, exemplified by adenomatous transformation of pulmonary alveolar epithelial type II (AT2) cells¹. Here we demonstrate a different scenario: expression of KRAS(G12D) in differentiated AT1 cells reprograms them slowly and asynchronously back into AT2 stem cells that go on to generate indolent tumours. Like human lepidic adenocarcinoma, the tumour cells slowly spread along alveolar walls in a non-destructive manner and have low ERK activity. We find that AT1 and AT2 cells act as distinct cells of origin and manifest divergent responses to concomitant WNT activation and KRAS(G12D) induction, which accelerates AT2-derived but inhibits AT1-derived adenoma proliferation. Augmentation of ERK activity in KRAS(G12D)-induced AT1 cells increases transformation efficiency, proliferation and progression from lepidic to mixed tumour histology. Overall, we have identified a new cell of origin for lung adenocarcinoma, the AT1 cell, which recapitulates features of human lepidic cancer. In so doing, we also uncover a capacity for oncogenic KRAS to reprogram a differentiated and quiescent cell back into its parent stem cell en route to adenomatous transformation. Our work further reveals that irrespective of a given cancer's current molecular profile and driver oncogene, the cell of origin exerts a pervasive and perduring influence on its subsequent behaviour.

2023-08-02

Nanospray Desorption Electrospray Ionization (Nano-DESI) Mass Spectrometry Imaging with High Ion Mobility Resolution

Jiang LX, Hernly E, Hu H, Hilger RT, Neuweger H, Yang M, Laskin J

TTD-Purdue

Untargeted separation of isomeric and isobaric species in mass spectrometry imaging (MSI) is challenging. The combination of ion mobility spectrometry (IMS) with MSI has emerged as an effective strategy for differentiating isomeric and isobaric species, which substantially enhances the molecular coverage and specificity of MSI experiments. In this study, we have implemented nanospray desorption electrospray ionization (nano-DESI) MSI on a trapped ion mobility spectrometry (TIMS) mass spectrometer. A new nano-DESI source was constructed, and a specially designed inlet extension was fabricated to accommodate the new source. The nano-DESI-TIMS-MSI platform was evaluated by imaging mouse brain tissue sections. We achieved high ion mobility resolution by utilizing three narrow mobility scan windows that covered the majority of the lipid molecules. Notably, the mobility resolution reaching up to 300 in this study is much higher than the resolution obtained in our previous study using drift tube IMS. High-resolution TIMS successfully separated lipid isomers and isobars, revealing their distinct localizations in tissue samples. Our results further demonstrate the power of high-mobility-resolution IMS for unraveling the complexity of biomolecular mixtures analyzed in MSI experiments.

2023-08-15

Integrated single-cell chromatin and transcriptomic analyses of human scalp identify gene-regulatory programs and critical cell types for hair and skin diseases

Ober-Reynolds B, Wang C, Ko JM, Rios EJ, Aasi SZ, Davis MM, Oro AE, Greenleaf WJ

TMC-Stanford

Genome-wide association studies have identified many loci associated with hair and skin disease, but identification of causal variants requires deciphering of gene-regulatory networks in relevant cell types. We generated matched single-cell chromatin profiles and transcriptomes from scalp tissue from healthy controls and patients with alopecia areata, identifying diverse cell types of the hair follicle niche. By interrogating these datasets at multiple levels of cellular resolution, we infer 50-100% more enhancer-gene links than previous approaches and show that aggregate enhancer accessibility for highly regulated genes predicts expression. We use these gene-regulatory maps to prioritize cell types, genes and causal variants implicated in the pathobiology of androgenetic alopecia (AGA), eczema and other complex traits. AGA genome-wide association study signals are enriched in dermal papilla regulatory regions, supporting the role of these cells as drivers of AGA pathogenesis. Finally, we train machine learning models to nominate single-nucleotide polymorphisms that affect gene expression through disruption of transcription factor binding, predicting candidate functional single-nucleotide polymorphisms for AGA and eczema.

2023-08-15

Proteome Mapping of the Human Pancreatic Islet Microenvironment Reveals Endocrine-Exocrine Signaling Sphere of Influence

Gosline SJC, Veličković M, Pino JC, Day LZ, Attah IK, Swensen AC, Danna V, Posso C, Rodland KD, Chen J, Matthews CE, Campbell-Thompson M, Laskin J, Burnum-Johnson K, Zhu Y, Piehowski PD

TMC-PNNL

The need for a clinically accessible method with the ability to match protein activity within heterogeneous tissues is currently unmet by existing technologies. Our proteomics sample preparation platform, named microPOTS (Microdroplet Processing in One pot for Trace Samples), can be used to measure relative protein abundance in micron-scale samples alongside the spatial location of each measurement, thereby tying biologically interesting proteins and pathways to distinct regions. However, given the smaller pixel/voxel number and amount of tissue measured, standard mass spectrometric analysis pipelines have proven inadequate. Here we describe how existing computational approaches can be adapted to focus on the specific biological questions asked in spatial proteomics experiments. We apply this approach to present an unbiased characterization of the human islet microenvironment comprising the entire complex array of cell types involved while maintaining spatial information and the degree of the islet's sphere of influence. We identify specific functional activity unique to the pancreatic islet cells and demonstrate how far their signature can be detected in the adjacent tissue. Our results show that we can distinguish pancreatic islet cells from the neighboring exocrine tissue environment, recapitulate known biological functions of islet cells, and identify a spatial gradient in the expression of RNA processing proteins within the islet microenvironment.

2023-08-18

Systematic Sampling of the Female Reproductive System for Molecular Characterization

Fisher SA, Grijalva M, Guo R, Johnston SA, Laurent LC, Nguyen H, Renz J, Rosario JG, Rudich S, Gregory BD, Kim J, O'Neill K

TMC-UPenn

As part of the National Institutes of Health Human BioMolecular Atlas Program to develop a global platform to map the 37 trillion cells in the adult human body, we are generating a comprehensive molecular characterization of the female reproductive system. Data gathered from multiple single-cell/single-nucleus and spatial molecular assays will be used to build a 3D molecular atlas. Herein, we describe our multistep protocol, beginning with an optimized organ procurement workflow that maintains functional characteristics of the uterus, ovaries, and fallopian tubes by perfusing these organs with preservation solution. We have also developed a structured tissue sampling procedure that retains information on individual-level anatomic, physiologic, and individual diversity of the female reproductive system, toward full exploration of the function and structure of female reproductive cells. © 2023 Wiley Periodicals LLC. Basic Protocol 1: Preparation and preservation of the female reproductive system (ovaries, fallopian tubes, and uterus) prior to procurement; Basic Protocol 2: Removal of the female reproductive system en bloc; Basic Protocol 3: Postsurgical dissection of ovaries; Basic Protocol 4: Postsurgical dissection of fallopian tubes; Basic Protocol 5: Postsurgical dissection of cervix; Basic Protocol 6: Postsurgical dissection of uterine body; Support Protocol 1: OCT-embedded tissue protocol; Support Protocol 2: Tissue fixation protocol; Support Protocol 3: Snap-frozen tissue protocol; Basic Protocol 7: Tissue slice preparation for Visium analysis; Support Protocol 4: Hematoxylin and eosin staining for 10X Visium imaging; Basic Protocol 8: Manual tissue dissociation for Multiome analysis; Basic Protocol 9: Tissue dissociation for Multiome analysis using S2 Singulator.

2023-08-29

Proteome Landscapes of Human Hepatocellular Carcinoma and Intrahepatic Cholangiocarcinoma

Yi X, Zhu J, Liu W, Peng L, Lu C, Sun P, Huang L, Nie X, Huang S, Guo T, Zhu Y

TTD-Purdue

Liver cancer is among the leading causes of cancer mortality worldwide. Particularly, hepatocellular carcinoma (HCC) and intrahepatic cholangiocarcinoma (CCA) have been extensively investigated from the aspect of tumor biology. However, a comprehensive and systematic understanding of the molecular characteristics of HCC and CCA remains absent. Here, we characterized the proteome landscapes of HCC and CCA using the data-independent acquisition (DIA) mass spectrometry (MS) method. By comparing the quantitative proteomes of HCC and CCA, we found several differences between the two cancer types. In particular, we found an abnormal lipid metabolism in HCC and activated extracellular matrix-related pathways in CCA. We next developed a three-protein classifier to distinguish CCA from HCC, achieving an area under the curve (AUC) of 0.92, and an accuracy of 90% in an independent validation cohort of 51 patients. The distinct molecular characteristics of HCC and CCA presented in this study provide new insights into the tumor biology of these two major primary liver cancers. Our findings may help develop more efficient diagnostic approaches and new targeted drug treatments.

2023-09-01

Matrisome AnalyzeR - a suite of tools to annotate and quantify ECM molecules in big datasets across organisms

Petrov PB, Considine JM, Izzi V, Naba A

DP-Illinois

The extracellular matrix (ECM) is a complex meshwork of proteins that forms the scaffold of all tissues in multicellular organisms. It plays crucial roles in all aspects of life - from orchestrating cell migration during development, to supporting tissue repair. It also plays critical roles in the etiology or progression of diseases. To study this compartment, we have previously defined the compendium of all genes encoding ECM and ECM-associated proteins for multiple organisms. We termed this compendium the 'matrisome' and further classified matrisome components into different structural or functional categories. This nomenclature is now largely adopted by the research community to annotate '-omics' datasets and has contributed to advance both fundamental and translational ECM research. Here, we report the development of Matrisome AnalyzeR, a suite of tools including a web-based application and an R package. The web application can be used by anyone interested in annotating, classifying and tabulating matrisome molecules in large datasets without requiring programming knowledge. The companion R package is available to more experienced users, interested in processing larger datasets or in additional data visualization options.

2023-09-06

Dynamic Glycoprotein Hyposialylation Promotes Chemotherapy Evasion and Metastatic Seeding of Quiescent Circulating Tumor Cell Clusters in Breast Cancer

Dashzeveg NK, Jia Y, Zhang Y, Gerratana L, Patel P, Shajahan A, Dandar T, Ramos EK, Almubarak HF, Adorno-Cruz V, Taftaf R, Schuster EJ, Scholten D, Sokolowski MT, Reduzzi C, El-Shennawy L, Hoffmann AD, Manai M, Zhang Q, D'Amico P, Azadi P, Colley KJ, Platanias LC, Shah AN, Gradishar WJ, Cristofanilli M, Muller WA, Cobb BA, Liu H

TTD-PNNL/Northwestern

Most circulating tumor cells (CTC) are detected as single cells, whereas a small proportion of CTCs in multicellular clusters with stemness properties possess 20- to 100-times higher metastatic propensity than the single cells. Here we report that CTC dynamics in both singles and clusters in response to therapies predict overall survival for breast cancer. Chemotherapy-evasive CTC clusters are relatively quiescent with a specific loss of ST6GAL1-catalyzed α2,6-sialylation in glycoproteins. Dynamic hyposialylation in CTCs or deficiency of ST6GAL1 promotes cluster formation for metastatic seeding and enables cellular quiescence to evade paclitaxel treatment in breast cancer. Glycoproteomic analysis reveals newly identified protein substrates of ST6GAL1, such as adhesion or stemness markers PODXL, ICAM1, ECE1, ALCAM1, CD97, and CD44, contributing to CTC clustering (aggregation) and metastatic seeding. As a proof of concept, neutralizing antibodies against one newly identified contributor, PODXL, inhibit CTC cluster formation and lung metastasis associated with paclitaxel treatment for triple-negative breast cancer. Significance: This study discovers that dynamic loss of terminal sialylation in glycoproteins of CTC clusters contributes to the fate of cellular dormancy, advantageous evasion to chemotherapy, and enhanced metastatic seeding. It identifies PODXL as a glycoprotein substrate of ST6GAL1 and a candidate target to counter chemoevasion-associated metastasis of quiescent tumor cells.

2023-09-08

Segmentation quality assessment by automated detection of erroneous surface regions in medical images

Zaman FA, Zhang L, Zhang H, Sonka M, Wu X

TMC-CHOP

Despite the advancement in deep learning-based semantic segmentation methods, which have achieved accuracy levels of field experts in many computer vision applications, the same general approaches may frequently fail in 3D medical image segmentation due to complex tissue structures, noisy acquisition, disease-related pathologies, as well as the lack of sufficiently large datasets with associated annotations. For expeditious diagnosis and quantitative image analysis in large-scale clinical trials, there is a compelling need to predict segmentation quality without ground truth. In this paper, we propose a deep learning framework to locate erroneous regions on the boundary surfaces of segmented objects for quality control and assessment of segmentation. A Convolutional Neural Network (CNN) is explored to learn the boundary related image features of multi-objects that can be used to identify location-specific inaccurate segmentation. The predicted error locations can facilitate efficient user interaction for interactive image segmentation (IIS). We evaluated the proposed method on two data sets: Osteoarthritis Initiative (OAI) 3D knee MRI and 3D calf muscle MRI. The average sensitivity scores of 0.95 and 0.92, and the average positive predictive values of 0.78 and 0.91 were achieved, respectively, for erroneous surface region detection of knee cartilage segmentation and calf muscle segmentation. Our experiment demonstrated promising performance of the proposed method for segmentation quality assessment by automated detection of erroneous surface regions in medical images.

2023-09-11

OME-Zarr: a cloud-optimized bioimaging file format with international community support

Moore J, Basurto-Lozada D, Besson S, Bogovic J, Bragantini J, Brown EM, Burel JM, Casas Moreno X, de Medeiros G, Diel EE, Gault D, Ghosh SS, Gold I, Halchenko YO, Hartley M, Horsfall D, Keller MS, Kittisopikul M, Kovacs G, Küpcü Yoldaş A, Kyoda K, le Tournoulx de la Villegeorges A, Li T, Liberali P, Lindner D, Linkert M, Lüthi J, Maitin-Shepard J, Manz T, Marconato L, McCormick M, Lange M, Mohamed K, Moore W, Norlin N, Ouyang W, Özdemir B, Palla G, Pape C, Pelkmans L, Pietzsch T, Preibisch S, Prete M, Rzepka N, Samee S, Schaub N, Sidky H, Solak AC, Stirling DR, Striebel J, Tischer C, Toloudis D, Virshup I, Walczysko P, Watson AM, Weisbart E, Wong F, Yamauchi KA, Bayraktar O, Cimini BA, Gehlenborg N, Haniffa M, Hotaling N, Onami S, Royer LA, Saalfeld S, Stegle O, Theis FJ, Swedlow JR

HIVE TC-Harvard

A growing community is constructing a next-generation file format (NGFF) for bioimaging to overcome problems of scalability and heterogeneity. Organized by the Open Microscopy Environment (OME), individuals and institutes across diverse modalities facing these problems have designed a format specification process (OME-NGFF) to address these needs. This paper brings together a wide range of those community members to describe the cloud-optimized format itself, OME-Zarr, along with tools and data resources available today to increase FAIR access and remove barriers in the scientific process. The current momentum offers an opportunity to unify a key component of the bioimaging domain: the file format that underlies so many personal, institutional, and global data management and analysis tasks.

2023-09-11

Transcriptomic profiling of tissue environments critical for post-embryonic patterning and morphogenesis of zebrafish skin

Aman AJ, Saunders LM, Carr AA, Srivatasan S, Eberhard C, Carrington B, Watkins-Chow D, Pavan WJ, Trapnell C, Parichy DM

TMC-Cal Tech

Pigment patterns and skin appendages are prominent features of vertebrate skin. In zebrafish, regularly patterned pigment stripes and an array of calcified scales form simultaneously in the skin during post-embryonic development. Understanding the mechanisms that regulate stripe patterning and scale morphogenesis may lead to the discovery of fundamental mechanisms that govern the development of animal form. To learn about cell types and signaling interactions that govern skin patterning and morphogenesis, we generated and analyzed single-cell transcriptomes of skin from wild-type fish as well as fish having genetic or transgenically induced defects in squamation or pigmentation. These data reveal a previously undescribed population of epidermal cells that express transcripts encoding enamel matrix proteins, suggest hormonal control of epithelial-mesenchymal signaling, clarify the signaling network that governs scale papillae development, and identify a critical role for the hypodermis in supporting pigment cell development. Additionally, these comprehensive single-cell transcriptomic data representing skin phenotypes of biomedical relevance should provide a useful resource for accelerating the discovery of mechanisms that govern skin development and homeostasis.

2023-09-11

Semantic-Aware Contrastive Learning for Multi-Object Medical Image Segmentation

Lee HH, Tang Y, Yang Q, Yu X, Cai LY, Remedios LW, Bao S, Landman BA, Huo Y

TMC-Vanderbilt (Kidney)

Medical image segmentation, or computing voxel-wise semantic masks, is a fundamental yet challenging task in the medical imaging domain. To increase the ability of encoder-decoder neural networks to perform this task across large clinical cohorts, contrastive learning provides an opportunity to stabilize model initialization and enhance downstream task performance without ground-truth voxel-wise labels. However, multiple target objects with different semantic meanings and contrast levels may exist in a single image, which poses a problem for adapting traditional contrastive learning methods from prevalent "image-level classification" to "pixel-level segmentation". In this article, we propose a simple semantic-aware contrastive learning approach leveraging attention masks and image-wise labels to advance multi-object semantic segmentation. Briefly, we embed different semantic objects to different clusters rather than the traditional image-level embeddings. We evaluate our proposed method on a multi-organ medical image segmentation task with both in-house data and MICCAI Challenge 2015 BTCV datasets. Compared with current state-of-the-art training strategies, our proposed pipeline yields a substantial improvement of 5.53% and 6.09% in Dice score for the two medical image segmentation cohorts, respectively (p-value 0.01). The performance of the proposed method is further assessed on an external medical image cohort via the MICCAI Challenge FLARE 2021 dataset, achieving a substantial improvement from Dice 0.922 to 0.933 (p-value 0.01).
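
The Dice score used as the evaluation metric here can be sketched for a single label in a multi-object segmentation. This is a minimal illustration on flattened label arrays, assuming integer labels per voxel; the function name and toy data are hypothetical and do not reproduce the paper's evaluation code.

```python
def dice_score(pred, truth, label):
    """Dice coefficient for one label between two flattened label maps.

    pred, truth: equal-length sequences of integer labels (one per voxel).
    label: the object class to score.
    """
    if len(pred) != len(truth):
        raise ValueError("pred and truth must have the same length")

    # Voxel indices assigned to this label in each map.
    pred_voxels = {i for i, v in enumerate(pred) if v == label}
    truth_voxels = {i for i, v in enumerate(truth) if v == label}

    total = len(pred_voxels) + len(truth_voxels)
    if total == 0:
        # Convention assumed here: label absent from both maps counts as perfect agreement.
        return 1.0
    return 2 * len(pred_voxels & truth_voxels) / total


# Toy example (hypothetical data) with background 0 and two objects, 1 and 2.
pred = [0, 1, 1, 2, 2, 2]
truth = [0, 1, 2, 2, 2, 0]
for label in (1, 2):
    print(label, round(dice_score(pred, truth, label), 3))
```

Scoring each label separately and averaging is one common way to summarize multi-organ results; whether the paper averages this way is not stated in the abstract.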

2023-09-14

Scalable Nanopore sequencing of human genomes provides a comprehensive view of haplotype-resolved variation and methylation

Kolmogorov M, Billingsley KJ, Mastoras M, Meredith M, Monlong J, Lorig-Roach R, Asri M, Alvarez Jerez P, Malik L, Dewan R, Reed X, Genner RM, Daida K, Behera S, Shafin K, Pesout T, Prabakaran J, Carnevali P, Yang J, Rhie A, Scholz SW, Traynor BJ, Miga KH, Jain M, Timp W, Phillippy AM, Chaisson M, Sedlazeck FJ, Blauwendraat C, Paten B

HIVE TC-CMU

Long-read sequencing technologies substantially overcome the limitations of short-reads but have not been considered as a feasible replacement for population-scale projects, being a combination of too expensive, not scalable enough or too error-prone. Here we develop an efficient and scalable wet lab and computational protocol, Napu, for Oxford Nanopore Technologies long-read sequencing that seeks to address those limitations. We applied our protocol to cell lines and brain tissue samples as part of a pilot project for the National Institutes of Health Center for Alzheimer's and Related Dementias. Using a single PromethION flow cell, we can detect single nucleotide polymorphisms with F1-score comparable to Illumina short-read sequencing. Small indel calling remains difficult within homopolymers and tandem repeats, but achieves good concordance to Illumina indel calls elsewhere. Further, we can discover structural variants with F1-score on par with state-of-the-art de novo assembly methods. Our protocol phases small and structural variants at megabase scales and produces highly accurate, haplotype-specific methylation calls.

2023-09-15

Early cancer detection by SERS spectroscopy and machine learning

Shi L, Li Y, Li Z

TMC-URMC

A new approach for early detection of multiple cancers is presented by integrating SERS spectroscopy of serum molecular fingerprints and machine learning.

2023-09-15

Rapid Setup of Tissue Microarray and Tiled Area Imaging on the Multiplexed Ion Beam Imaging Microscope using the Tile/SED/Array Interface

Piyadasa H, Oberlton B, Kong A, Camacho Fullaway C, Reddy Varra S, Sowers C, Tsai AG

TMC-Stanford (Bone marrow)

Multiplexed ion beam imaging (MIBI) is a next-generation mass spectrometry-based microscopy technique that generates 40+ plex images of protein expression in histologic tissues, enabling detailed dissection of cellular phenotypes and histoarchitectural organization. A key bottleneck in operation occurs when users select the physical locations on the tissue for imaging. As the scale and complexity of MIBI experiments have increased, the manufacturer-provided interface and third-party tools have become increasingly unwieldy for imaging large tissue microarrays and tiled tissue areas. Thus, a web-based, interactive, what-you-see-is-what-you-get (WYSIWYG) graphical interface layer - the tile/SED/array Interface (TSAI) - was developed for users to set imaging locations using familiar and intuitive mouse gestures such as drag-and-drop, click-and-drag, and polygon drawing. Written according to web standards already built into modern web browsers, it requires no installation of external programs, extensions, or compilers. Of interest to the hundreds of current MIBI users, this interface dramatically simplifies and accelerates the setup of large, complex MIBI runs.

2023-09-19

Surgical management of lymphedema: Does a microsurgeon's bias exist?

Friedman R, Ismail Aly ME, Singhal D

TMC-BIDMC

N/A

2023-09-19

Prospective on Imaging Mass Spectrometry in Clinical Diagnostics

Moore JL, Patterson NH, Norris JL, Caprioli RM

TMC-Vanderbilt (Kidney)

Imaging mass spectrometry (IMS) is a molecular technology utilized for spatially driven research, providing molecular maps from tissue sections. This article reviews matrix-assisted laser desorption ionization (MALDI) IMS and its progress as a primary tool in the clinical laboratory. MALDI mass spectrometry has been used to classify bacteria and perform other bulk analyses for plate-based assays for many years. However, the clinical application of spatial data within a tissue biopsy for diagnoses and prognoses is still an emerging opportunity in molecular diagnostics. This work considers spatially driven mass spectrometry approaches for clinical diagnostics and addresses aspects of new imaging-based assays that include analyte selection, quality control/assurance metrics, data reproducibility, data classification, and data scoring. It is necessary to implement these tasks for the rigorous translation of IMS to the clinical laboratory; however, this requires detailed standardized protocols for introducing IMS into the clinical laboratory to deliver reliable and reproducible results that inform and guide patient care.

2023-09-21

Multimodal single-cell datasets characterize antigen-specific CD8+ T cells across SARS-CoV-2 vaccination and infection

Zhang B, Upadhyay R, Hao Y, Samanovic MI, Herati RS, Blair JD, Axelrad J, Mulligan MJ, Littman DR, Satija R

HIVE MC-NYGC

The immune response to SARS-CoV-2 antigen after infection or vaccination is defined by the durable production of antibodies and T cells. Population-based monitoring typically focuses on antibody titer, but there is a need for improved characterization and quantification of T cell responses. Here, we used multimodal sequencing technologies to perform a longitudinal analysis of circulating human leukocytes collected before and after immunization with the mRNA vaccine BNT162b2. Our data indicated distinct subpopulations of CD8⁺ T cells, which reliably appeared 28 days after prime vaccination. Using a suite of cross-modality integration tools, we defined their transcriptome, accessible chromatin landscape and immunophenotype, and we identified unique biomarkers within each modality. We further showed that this vaccine-induced population was SARS-CoV-2 antigen-specific and capable of rapid clonal expansion. Moreover, we identified these CD8⁺ T cell populations in scRNA-seq datasets from COVID-19 patients and found that their relative frequency and differentiation outcomes were predictive of subsequent clinical outcomes.

2023-09-28

Navigating the kidney organoid: insights into assessment and enhancement of nephron function

Tabibzadeh N, Satlin LM, Jain S, Morizane R

TMC-WUSTL

Kidney organoids are three-dimensional structures generated from pluripotent stem cells (PSCs) that are capable of recapitulating the major structures of mammalian kidneys. As this technology is expected to be a promising tool for studying renal biology, drug discovery, and regenerative medicine, the functional capacity of kidney organoids has emerged as a critical question in the field. Kidney organoids produced using several protocols harbor key structures of native kidneys. Here we review the current state, recent advances, and future challenges in the functional characterization of kidney organoids, strategies to accelerate and enhance kidney organoid functions, and access to PSC resources to advance organoid research. The strategies to construct physiologically relevant kidney organoids include the use of organ-on-a-chip technologies that integrate fluid circulation and improve organoid maturation. These approaches result in increased expression of the major tubular transporters and elements of mechanosensory signaling pathways suggestive of improved functionality. Nevertheless, continuous efforts remain crucial to create kidney tissue that more faithfully replicates physiological conditions for future applications in kidney regeneration medicine and their ethical use in patient care.

2023-10-02

The technological landscape and applications of single-cell multi-omics

Baysoy A, Bai Z, Satija R, Fan R

TTD-Yale

Single-cell multi-omics technologies and methods characterize cell states and activities by simultaneously integrating various single-modality omics methods that profile the transcriptome, genome, epigenome, epitranscriptome, proteome, metabolome and other (emerging) omics. Collectively, these methods are revolutionizing molecular cell biology research. In this comprehensive Review, we discuss established multi-omics technologies as well as cutting-edge and state-of-the-art methods in the field. We discuss how multi-omics technologies have been adapted and improved over the past decade using a framework characterized by optimization of throughput and resolution, modality integration, uniqueness and accuracy, and we also discuss multi-omics limitations. We highlight the impact that single-cell multi-omics technologies have had in cell lineage tracing, tissue-specific and cell-specific atlas production, tumour immunology and cancer genetics, and in mapping of cellular spatial information in fundamental and translational research. Finally, we discuss bioinformatics tools that have been developed to link different omics modalities and elucidate functionality through the use of better mathematical modelling and computational methods.

2023-10-09

High-plex protein and whole transcriptome co-mapping at cellular resolution with spatial CITE-seq

Liu Y, DiStasio M, Su G, Asashima H, Enninful A, Qin X, Deng Y, Nam J, Gao F, Bordignon P, Cassano M, Tomayko M, Xu M, Halene S, Craft JE, Hafler D, Fan R

TTD-Yale

In this study, we extended co-indexing of transcriptomes and epitopes (CITE) to the spatial dimension and demonstrated high-plex protein and whole transcriptome co-mapping. We profiled 189 proteins and whole transcriptome in multiple mouse tissue types with spatial CITE sequencing and then further applied the method to measure 273 proteins and transcriptome in human tissues, revealing spatially distinct germinal center reactions in tonsil and early immune activation in skin at the Coronavirus Disease 2019 mRNA vaccine injection site.

2023-10-14

Potential and risks of artificial intelligence models: Common in medicine practice and special in pediatric urology

Wen Y, Di H

TMC-Stanford

N/A

2023-10-23

High-resolution integrated microfluidic probe for mass spectrometry imaging of biological tissues

Li X, Hu H, Laskin J

TTD-Purdue

Nanospray desorption electrospray ionization (nano-DESI) is an ambient ionization technique that enables molecular imaging of biological samples with high spatial resolution. We have recently developed an integrated microfluidic probe (iMFP) for nano-DESI mass spectrometry imaging (MSI) that significantly enhances the robustness of the technique. In this study, we designed a new probe that enables imaging of biological samples with high spatial resolution. The new probe design features smaller primary and spray channels and an entirely new configuration of the sampling port that enables robust imaging of tissues with a spatial resolution of 8-10 μm. We demonstrate the spatial resolution, sensitivity, durability, and throughput of the iMFP by imaging mouse uterine and brain tissue sections. The robustness of the high-resolution iMFP allowed us to perform first imaging experiments with both high spatial resolution and high throughput, which is particularly advantageous for high-resolution imaging of large tissue sections of interest to most MSI applications. Overall, the new probe design opens opportunities for mapping of biomolecules in biological samples with high throughput and cellular resolution, which is important for understanding biological systems.

2023-11-14

Dimension-agnostic and granularity-based spatially variable gene identification using BSP

Wang J, Li J, Kramer ST, Su L, Chang Y, Xu C, Eadon MT, Kiryluk K, Ma Q, Xu D

TMC-WUSTL

Identifying spatially variable genes (SVGs) is critical in linking molecular cell functions with tissue phenotypes. Spatially resolved transcriptomics captures cellular-level gene expression with corresponding spatial coordinates in two or three dimensions and can be used to infer SVGs effectively. However, current computational methods may not achieve reliable results and often cannot handle three-dimensional spatial transcriptomic data. Here we introduce BSP (big-small patch), a non-parametric model that compares gene expression patterns at two spatial granularities to identify SVGs from two- or three-dimensional spatial transcriptomics data in a fast and robust manner. This method has been extensively tested in simulations, demonstrating superior accuracy, robustness, and high efficiency. BSP is further validated by substantiated biological discoveries in cancer, neural science, rheumatoid arthritis, and kidney studies with various types of spatial transcriptomics technologies.

2023-12-04

Evidence for lung barrier regeneration by differentiation prior to binucleated and stem cell division

Guild J, Juul NH, Andalon A, Taenaka H, Coffey RJ, Matthay MA, Desai TJ

TTD-Stanford

With each breath, oxygen diffuses across remarkably thin alveolar type I (AT1) cells into underlying capillaries. Interspersed cuboidal AT2 cells produce surfactant and act as stem cells. Even transient disruption of this delicate barrier can promote capillary leak. Here, we selectively ablated AT1 cells, which uncovered rapid AT2 cell flattening with near-continuous barrier preservation, culminating in AT1 differentiation. Proliferation subsequently restored depleted AT2 cells in two phases, mitosis of binucleated AT2 cells followed by replication of mononucleated AT2 cells. M phase entry of binucleated and S phase entry of mononucleated cells were both triggered by AT1-produced hbEGF signaling via EGFR to Wnt-active AT2 cells. Repeated AT1 cell killing elicited exuberant AT2 proliferation, generating aberrant daughter cells that ceased surfactant function yet failed to achieve AT1 differentiation. This hyperplasia eventually resolved, yielding normal-appearing alveoli. Overall, this specialized regenerative program confers a delicate simple epithelium with functional resiliency on par with the physical durability of thicker, pseudostratified, or stratified epithelia.

2023-12-12

Advances in Imaging Mass Spectrometry for Biomedical and Clinical Research

Djambazova KV, van Ardenne JM, Spraggins JM

TMC-Vanderbilt (Kidney)

Imaging mass spectrometry (IMS) allows for the untargeted mapping of biomolecules directly from tissue sections. This technology is increasingly integrated into biomedical and clinical research environments to supplement traditional microscopy and provide molecular context for tissue imaging. IMS has widespread clinical applicability in the fields of oncology, dermatology, microbiology, and others. This review summarizes the two most widely employed IMS technologies, matrix-assisted laser desorption/ionization (MALDI) and desorption electrospray ionization (DESI), and covers technological advancements, including efforts to increase spatial resolution, specificity, and throughput. We also highlight recent biomedical applications of IMS, primarily focusing on disease diagnosis, classification, and subtyping.

2023-12-15

Spatial pharmacology using mass spectrometry imaging

Rajbhandari P, Neelakantan TV, Hosny N, Stockwell BR

TTD-Columbia/Penn State

The emerging and powerful field of spatial pharmacology can map the spatial distribution of drugs and their metabolites, as well as their effects on endogenous biomolecules including metabolites, lipids, proteins, peptides, and glycans, without the need for labeling. This is enabled by mass spectrometry imaging (MSI) that provides previously inaccessible information in diverse phases of drug discovery and development. We provide a perspective on how MSI technologies and computational tools can be implemented to reveal quantitative spatial drug pharmacokinetics and toxicology, tissue subtyping, and associated biomarkers. We also highlight the emerging potential of comprehensive spatial pharmacology through integration of multimodal MSI data with other spatial technologies. Finally, we describe how to overcome challenges including improving reproducibility and compound annotation to generate robust conclusions that will improve drug discovery and development processes.

2024-01-09

Slide-tags enables single-nucleus barcoding for multimodal spatial genomics

Russell AJC, Weir JA, Nadaf NM, Shabet M, Kumar V, Kambhampati S, Raichur R, Marrero GJ, Liu S, Balderrama KS, Vanderburg CR, Shanmugam V, Tian L, Iorgulescu JB, Yoon CH, Wu CJ, Macosko EZ, Chen F

RTI-Broad

Recent technological innovations have enabled the high-throughput quantification of gene expression and epigenetic regulation within individual cells, transforming our understanding of how complex tissues are constructed^1-6. However, missing from these measurements is the ability to routinely and easily spatially localize these profiled cells. We developed a strategy, Slide-tags, in which single nuclei within an intact tissue section are tagged with spatial barcode oligonucleotides derived from DNA-barcoded beads with known positions. These tagged nuclei can then be used as an input into a wide variety of single-nucleus profiling assays. Application of Slide-tags to the mouse hippocampus positioned nuclei at less than 10 μm spatial resolution and delivered whole-transcriptome data that are indistinguishable in quality from ordinary single-nucleus RNA-sequencing data. To demonstrate that Slide-tags can be applied to a wide variety of human tissues, we performed the assay on brain, tonsil and melanoma. We revealed cell-type-specific spatially varying gene expression across cortical layers and spatially contextualized receptor-ligand interactions driving B cell maturation in lymphoid tissue. A major benefit of Slide-tags is that it is easily adaptable to almost any single-cell measurement technology. As a proof of principle, we performed multiomic measurements of open chromatin, RNA and T cell receptor (TCR) sequences in the same cells from metastatic melanoma, identifying transcription factor motifs driving cancer cell state transitions in spatially distinct microenvironments. Slide-tags offers a universal platform for importing the compendium of established single-cell measurements into the spatial genomics repertoire.

2024-01-10

The chromatin landscape of healthy and injured cell types in the human kidney

Gisch DL, Brennan M, Lake BB, Basta J, Keller MS, Melo Ferreira R, Akilesh S, Ghag R, Lu C, Cheng YH, Collins KS, Parikh SV, Rovin BH, Robbins L, Stout L, Conklin KY, Diep D, Zhang B, Knoten A, Barwinska D, Asghari M, Sabo AR, Ferkowicz MJ, Sutton TA, Kelly KJ, De Boer IH, Rosas SE, Kiryluk K, Hodgin JB, Alakwaa F, Winfree S, Jefferson N, Türkmen A, Gaut JP, Gehlenborg N, Phillips CL, El-Achkar TM, Dagher PC, Hato T, Zhang K, Himmelfarb J, Kretzler M, Mollah S; Kidney Precision Medicine Project (KPMP); Jain S, Rauchman M, Eadon MT

TMC-WUSTL

There is a need to define regions of gene activation or repression that control human kidney cells in states of health, injury, and repair to understand the molecular pathogenesis of kidney disease and design therapeutic strategies. Comprehensive integration of gene expression with epigenetic features that define regulatory elements remains a significant challenge. We measure dual single nucleus RNA expression and chromatin accessibility, DNA methylation, and H3K27ac, H3K4me1, H3K4me3, and H3K27me3 histone modifications to decipher the chromatin landscape and gene regulation of the kidney in reference and adaptive injury states. We establish a spatially-anchored epigenomic atlas to define the kidney's active, silent, and regulatory accessible chromatin regions across the genome. Using this atlas, we note distinct control of adaptive injury in different epithelial cell types. A proximal tubule cell transcription factor network of ELF3, KLF6, and KLF10 regulates the transition between health and injury, while in thick ascending limb cells this transition is regulated by NR2F1. Further, combined perturbation of ELF3, KLF6, and KLF10 distinguishes two adaptive proximal tubular cell subtypes, one of which manifested a repair trajectory after knockout. This atlas will serve as a foundation to facilitate targeted cell-specific therapeutics by reprogramming gene regulatory networks.

2024-01-25

Gene panel selection for targeted spatial transcriptomics

Zhang Y, Petukhov V, Biederstedt E, Que R, Zhang K, Kharchenko PV

TMC-UCSD

Targeted spatial transcriptomics hold particular promise in analyzing complex tissues. Most such methods, however, measure only a limited panel of transcripts, which need to be selected in advance to inform on the cell types or processes being studied. A limitation of existing gene selection methods is their reliance on scRNA-seq data, ignoring platform effects between technologies. Here we describe gpsFISH, a computational method performing gene selection through optimizing detection of known cell types. By modeling and adjusting for platform effects, gpsFISH outperforms other methods. Furthermore, gpsFISH can incorporate cell type hierarchies and custom gene preferences to accommodate diverse design requirements.

2024-02-01

Expanding the coverage of spatial proteomics: a machine learning approach

Sun H, Li J, Murphy RF

HIVE TC-CMU

Motivation: Multiplexed protein imaging methods use a chosen set of markers and provide valuable information about complex tissue structure and cellular heterogeneity. However, the number of markers that can be measured in the same tissue sample is inherently limited. Results: In this paper, we present an efficient method to choose a minimal predictive subset of markers that for the first time allows the prediction of full images for a much larger set of markers. We demonstrate that our approach also outperforms previous methods for predicting cell-level protein composition. Most importantly, we demonstrate that our approach can be used to select a marker set that enables prediction of a much larger set than could be measured concurrently. Availability and implementation: All code and intermediate results are available in a Reproducible Research Archive at https://github.com/murphygroup/CODEXPanelOptimization.

2024-02-02

Parsing 20 Years of Public Data by AI Maps Trends in Proteomics and Forecasts Technology

Green JJ, Grimm C, Fristo A, Byrum J, Kelleher NL

RTI-Northwestern

The trends of the last 20 years in biotechnology were revealed using artificial intelligence and natural language processing (NLP) of publicly available data. Implementing this "science-of-science" approach, we capture convergent trends in the field of proteomics in both technology development and application across the phylogenetic tree of life. With major gaps in our knowledge about protein composition, structure, and location over time, we report trends in persistent, popular approaches and emerging technologies across 94 ideas from a corpus of 29 journals in PubMed over two decades. New metrics for clusters of these ideas reveal the progression and popularity of emerging approaches like single-cell, spatial, compositional, and chemical proteomics designed to better capture protein-level chemistry and biology. This analysis of the proteomics literature with advanced analytic tools quantifies the Rate of Rise for a next generation of technologies to better define, quantify, and visualize the multiple dimensions of the proteome that will transform our ability to measure and understand proteins in the coming decade.

2024-02-13

Modelling post-implantation human development to yolk sac blood emergence

Hislop J, Song Q, Keshavarz F K, Alavi A, Schoenberger R, LeGraw R, Velazquez JJ, Mokhtari T, Taheri MN, Rytel M, Chuva de Sousa Lopes SM, Watkins S, Stolz D, Kiani S, Sozen B, Bar-Joseph Z, Ebrahimkhani MR

HIVE TC-CMU

Implantation of the human embryo begins a critical developmental stage that comprises profound events including axis formation, gastrulation and the emergence of the haematopoietic system^1,2. Our mechanistic knowledge of this window of human life remains limited due to restricted access to in vivo samples for both technical and ethical reasons^3-5. Stem cell models of human embryo have emerged to help unlock the mysteries of this stage^6-16. Here we present a genetically inducible stem cell-derived embryoid model of early post-implantation human embryogenesis that captures the reciprocal codevelopment of embryonic tissue and the extra-embryonic endoderm and mesoderm niche with early haematopoiesis. This model is produced from induced pluripotent stem cells and shows unanticipated self-organizing cellular programmes similar to those that occur in embryogenesis, including the formation of amniotic cavity and bilaminar disc morphologies as well as the generation of an anterior hypoblast pole and posterior domain. The extra-embryonic layer in these embryoids lacks trophoblast and shows advanced multilineage yolk sac tissue-like morphogenesis that harbours a process similar to distinct waves of haematopoiesis, including the emergence of erythroid-, megakaryocyte-, myeloid- and lymphoid-like cells. This model presents an easy-to-use, high-throughput, reproducible and scalable platform to probe multifaceted aspects of human development and blood formation at the early post-implantation stage. It will provide a tractable human-based model for drug testing and disease modelling.

2024-02-20

Stabilized mosaic single-cell data integration using unshared features

Ghazanfar S, Guibentif C, Marioni JC

HIVE MC-NYGC

Currently available single-cell omics technologies capture many unique features with different biological information content. Data integration aims to place cells, captured with different technologies, onto a common embedding to facilitate downstream analytical tasks. Current horizontal data integration techniques use a set of common features, thereby ignoring non-overlapping features and losing information. Here we introduce StabMap, a mosaic data integration technique that stabilizes mapping of single-cell data by exploiting the non-overlapping features. StabMap first infers a mosaic data topology based on shared features, then projects all cells onto supervised or unsupervised reference coordinates by traversing shortest paths along the topology. We show that StabMap performs well in various simulation contexts, facilitates 'multi-hop' mosaic data integration where some datasets do not share any features and enables the use of spatial gene expression features for mapping dissociated single-cell data onto a spatial transcriptomic reference.
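
The idea of a mosaic data topology traversed along shortest paths can be sketched as a small graph problem. This is an illustration only, assuming datasets are nodes joined by an edge whenever they share at least one feature; the function names and toy feature sets are hypothetical, and this is not StabMap's implementation.

```python
from collections import deque


def mosaic_topology(features_by_dataset):
    """Build an undirected graph linking datasets that share any feature."""
    names = list(features_by_dataset)
    graph = {name: set() for name in names}
    for i, a in enumerate(names):
        for b in names[i + 1:]:
            if features_by_dataset[a] & features_by_dataset[b]:
                graph[a].add(b)
                graph[b].add(a)
    return graph


def shortest_path(graph, start, goal):
    """Breadth-first search; returns the list of datasets on a shortest path, or None."""
    queue = deque([[start]])
    seen = {start}
    while queue:
        path = queue.popleft()
        if path[-1] == goal:
            return path
        # Sort neighbours so the result is deterministic.
        for neighbour in sorted(graph[path[-1]]):
            if neighbour not in seen:
                seen.add(neighbour)
                queue.append(path + [neighbour])
    return None


# Toy example (hypothetical features): "atac" and "rna" share nothing directly,
# but both overlap with "multiome", which acts as the bridge.
features = {
    "rna": {"geneA", "geneB", "geneC"},
    "multiome": {"geneC", "peak1"},
    "atac": {"peak1", "peak2"},
}
graph = mosaic_topology(features)
print(shortest_path(graph, "atac", "rna"))
```

The path through the bridging dataset corresponds to the 'multi-hop' case described in the abstract, where two datasets with no shared features are still connected through an intermediate one.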

2024-02-20

Thermal-plex: fluidic-free, rapid sequential multiplexed imaging with DNA-encoded thermal channels

Hong F, Kishi JY, Delgado RN, Jeong J, Saka SK, Su H, Cepko CL, Yin P

TTD-Harvard

Multiplexed fluorescence imaging is typically limited to three- to five-plex on standard setups. Sequential imaging methods based on iterative labeling and imaging enable practical higher multiplexing, but generally require a complex fluidic setup with several rounds of slow buffer exchange (tens of minutes to an hour for each exchange step). We report the thermal-plex method, which removes complex and slow buffer exchange steps and provides fluidic-free, rapid sequential imaging. Thermal-plex uses simple DNA probes that are engineered to fluoresce sequentially when, and only when, activated with transient exposure to heating spikes at designated temperatures (thermal channels). Channel switching is fast (<30 s) and is achieved with a commercially available and affordable on-scope heating device. We demonstrate 15-plex RNA imaging (five thermal × three fluorescence channels) in fixed cells and retina tissues in less than 4 min, without using buffer exchange or fluidics. Thermal-plex introduces a new labeling method for efficient sequential multiplexed imaging.

2024-02-20

Unsupervised and supervised discovery of tissue cellular neighborhoods from cell phenotypes

Hu Y, Rong J, Xu Y, Xie R, Peng J, Gao L, Tan K

TMC-CHOP

It is poorly understood how different cells in a tissue organize themselves to support tissue functions. We describe the CytoCommunity algorithm for the identification of tissue cellular neighborhoods (TCNs) based on cell phenotypes and their spatial distributions. CytoCommunity learns a mapping directly from the cell phenotype space to the TCN space using a graph neural network model without intermediate clustering of cell embeddings. By leveraging graph pooling, CytoCommunity enables de novo identification of condition-specific and predictive TCNs under the supervision of sample labels. Using several types of spatial omics data, we demonstrate that CytoCommunity can identify TCNs of variable sizes with substantial improvement over existing methods. By analyzing risk-stratified colorectal and breast cancer data, CytoCommunity revealed new granulocyte-enriched and cancer-associated fibroblast-enriched TCNs specific to high-risk tumors and altered interactions between neoplastic and immune or stromal cells within and between TCNs. CytoCommunity can perform unsupervised and supervised analyses of spatial omics maps and enable the discovery of condition-specific cell-cell communication patterns across spatial scales.

2024-02-21

Multi-molecular hyperspectral PRM-SRS microscopy

Zhang W, Li Y, Fung AA, Li Z, Jang H, Zha H, Chen X, Gao F, Wu JY, Sheng H, Yao J, Skowronska-Krawczyk D, Jain S, Shi L

TMC-WUSTL

Lipids play crucial roles in many biological processes. Mapping spatial distributions and examining the metabolic dynamics of different lipid subtypes in cells and tissues are critical to better understanding their roles in aging and diseases. Commonly used imaging methods (such as mass spectrometry-based, fluorescence labeling, conventional optical imaging) can disrupt the native environment of cells/tissues, have limited spatial or spectral resolution, or cannot distinguish different lipid subtypes. Here we present a hyperspectral imaging platform that integrates a Penalized Reference Matching algorithm with Stimulated Raman Scattering (PRM-SRS) microscopy. Using this platform, we visualize and identify high density lipoprotein particles in human kidney, a high cholesterol to phosphatidylethanolamine ratio inside granule cells of mouse hippocampus, and subcellular distributions of sphingosine and cardiolipin in human brain. Our PRM-SRS displays unique advantages of enhanced chemical specificity, subcellular resolution, and fast data processing in distinguishing lipid subtypes in different organs and species.

2024-03-04

Proteome-scale tissue mapping using mass spectrometry based on label-free and multiplexed workflows

Kwon Y, Woo J, Yu F, Williams SM, Markillie LM, Moore RJ, Nakayasu ES, Chen J, Campbell-Thompson M, Mathews CE, Nesvizhskii AI, Qian WJ, Zhu Y

TMC-PNNL

Multiplexed biomolecular profiling of tissue microenvironment, or spatial omics, can provide deep insight into cellular compositions and interactions in both normal and diseased tissues. Proteome-scale tissue mapping, which aims to unbiasedly visualize all the proteins in a whole tissue section or region of interest, has attracted significant interest because it holds great potential to directly reveal diagnostic biomarkers and therapeutic targets. While many approaches are available, proteome mapping still faces significant technical challenges in both protein coverage and analytical throughput. Since many of these existing challenges are associated with mass spectrometry-based protein identification and quantification, we performed a detailed benchmarking study of three protein quantification methods for spatial proteome mapping: label-free, TMT-MS2, and TMT-MS3. Our study indicates that the label-free method provided the deepest coverage of ~3500 proteins at a spatial resolution of 50 μm and the largest quantification dynamic range, while the TMT-MS2 method holds great benefit in mapping throughput at >125 pixels per day. The evaluation also indicates that both label-free and TMT-MS2 provide robust protein quantification in terms of identifying differentially abundant proteins and spatially co-variable clusters. In the study of the pancreatic islet microenvironment, we demonstrated that deep proteome mapping not only enables identification of protein markers specific to different cell types but, more importantly, also reveals unknown or hidden protein patterns by spatial co-expression analysis.

2024-03-05

A Panoramic View of Cell Population Dynamics in Mammalian Aging

Zhang Z, Schaefer C, Jiang W, Lu Z, Lee J, Sziraki A, Abdulraouf A, Wick B, Haeussler M, Li Z, Molla G, Satija R, Zhou W, Cao J

HIVE MC-NYGC

To elucidate the aging-associated cellular population dynamics throughout the body, here we present PanSci, a single-cell transcriptome atlas profiling over 20 million cells from 623 mouse tissue samples, encompassing a range of organs across different life stages, sexes, and genotypes. This comprehensive dataset allowed us to identify more than 3,000 unique cellular states and catalog over 200 distinct aging-associated cell populations experiencing significant depletion or expansion. Our panoramic analysis uncovered temporally structured, organ- and lineage-specific shifts of cellular dynamics during lifespan progression. Moreover, we investigated aging-associated alterations in immune cell populations, revealing both widespread shifts and organ-specific changes. We further explored the regulatory roles of the immune system on aging and pinpointed specific age-related cell population expansions that are lymphocyte-dependent. The breadth and depth of our 'cell-omics' methodology not only enhance our comprehension of cellular aging but also lay the groundwork for exploring the complex regulatory networks among varied cell types in the context of aging and aging-associated diseases.

2024-03-08

Predicting drug outcome of population via clinical knowledge graph

Brbić M, Yasunaga M, Agarwal P, Leskovec J

TMC-Stanford

Optimal treatments depend on numerous factors such as drug chemical properties, disease biology, and patient characteristics to which the treatment is applied. To realize the promise of AI in healthcare, there is a need for designing systems that can capture patient heterogeneity and relevant biomedical knowledge. Here we present PlaNet, a geometric deep learning framework that reasons over population variability, disease biology, and drug chemistry by representing knowledge in the form of a massive clinical knowledge graph that can be enhanced by language models. Our framework is applicable to any sub-population, any drug as well as drug combinations, any disease, and to a wide range of pharmacological tasks. We apply the PlaNet framework to reason about outcomes of clinical trials: PlaNet predicts drug efficacy and adverse events, even for experimental drugs and their combinations that have never been seen by the model. Furthermore, PlaNet can estimate the effect of changing population on the trial outcome with direct implications on patient stratification in clinical trials. PlaNet takes fundamental steps towards AI-guided clinical trials design, offering valuable guidance for realizing the vision of precision medicine using AI.

2024-03-20

Mapping human tissues with highly multiplexed RNA in situ hybridization

Kalhor K, Chen CJ, Lee HS, Cai M, Nafisi M, Que R, Palmer CR, Yuan Y, Zhang Y, Li X, Song J, Knoten A, Lake BB, Gaut JP, Keene CD, Lein E, Kharchenko PV, Chun J, Jain S, Fan JB, Zhang K

TMC-UCSD

In situ transcriptomic techniques promise a holistic view of tissue organization and cell-cell interactions. There has been a surge of multiplexed RNA in situ mapping techniques, but their application to human tissues has been limited due to their large size, generally lower tissue quality and high autofluorescence. Here we report DART-FISH, a padlock probe-based technology capable of profiling hundreds to thousands of genes in centimeter-sized human tissue sections. We introduce an omni-cell type cytoplasmic stain that substantially improves the segmentation of cell bodies. Our enzyme-free isothermal decoding procedure allows us to image 121 genes in large sections from the human neocortex in <10 h. We successfully recapitulated the cytoarchitecture of 20 neuronal and non-neuronal subclasses. We further performed in situ mapping of 300 genes on a diseased human kidney, profiled >20 healthy and pathological cell states, and identified diseased niches enriched in transcriptionally altered epithelial cells and myofibroblasts.

2024-03-25

Piezo1 regulates meningeal lymphatic vessel drainage and alleviates excessive CSF accumulation

Choi D, Park E, Choi J, Lu R, Yu JS, Kim C, Zhao L, Yu J, Nakashima B, Lee S, Singhal D, Scallan JP, Zhou B, Koh CJ, Lee E, Hong YK

TMC-BIDMC

Piezo1 regulates multiple aspects of the vascular system by converting mechanical signals generated by fluid flow into biological processes. Here, we find that Piezo1 is necessary for the proper development and function of meningeal lymphatic vessels and that activating Piezo1 through transgenic overexpression or treatment with the chemical agonist Yoda1 is sufficient to increase cerebrospinal fluid (CSF) outflow by improving lymphatic absorption and transport. The abnormal accumulation of CSF, which often leads to hydrocephalus and ventriculomegaly, currently lacks effective treatments. We discovered that meningeal lymphatics in mouse models of Down syndrome were incompletely developed and abnormally formed. Selective overexpression of Piezo1 in lymphatics or systemic administration of Yoda1 in mice with hydrocephalus or Down syndrome resulted in a notable decrease in pathological CSF accumulation, ventricular enlargement and other associated disease symptoms. Together, our study highlights the importance of Piezo1-mediated lymphatic mechanotransduction in maintaining brain fluid drainage and identifies Piezo1 as a promising therapeutic target for treating excessive CSF accumulation and ventricular enlargement.

2024-03-31

Application of Delta T1 maps for quantitative and objective assessment of extent of resection and survival prediction in glioblastoma

Laing BR, Prah MA, Best BJ, Krucoff MO, Mueller WM, Schmainda KM

TTD-Harvard

Background and objectives: Gross-total resection (GTR) and low residual tumor volume (RTV) have been associated with increased survival in glioblastoma. Largely due to the subjectivity involved, the determination of GTR and RTV remains difficult in the postoperative setting. In response, the objective of this study is to evaluate the clinical efficacy of an easy-to-use MRI metric, called delta T1 (dT1), to quantify extent of resection (EOR) and RTV, in comparison to radiologist impression, to predict overall survival (OS) in glioblastoma patients. Methods: 59 patients who underwent resection of glioblastoma were retrospectively identified. Delta T1 (dT1) images, automatically created from the difference between calibrated post- and pre-contrast T1-weighted images, were used to quantify EOR and RTV. Kaplan-Meier survival estimates were determined for EOR categories, an RTV cutoff of 5 cm³ and radiologist interpretation of EOR. Multivariate Cox proportional hazard regression analysis was used to evaluate RTV and EOR along with effects related to sex, KPS, MGMT, and age on OS. Results: Kaplan-Meier analysis revealed a statistically significant difference in median OS for a dT1-determined RTV cutoff of 5 cm³ (P=0.0024, HR=2.18 (1.232-3.856)), but not for radiological impression (P=0.666) or dT1-determined EOR (P=0.0803), which was limited to a comparison between partial and subtotal resections. Furthermore, when covariates were accounted for in multivariate Cox regression, significant differences in OS were retained for dT1-determined RTV. Additionally, a significantly strong yet short-term effect of MGMT methylation status on OS was revealed for each RTV and EOR model. Conclusion: The utility of dT1 maps to quantify EOR and RTV in glioblastoma and predict survival suggests an emerging role for dT1s with relevance for intraoperative MRI, neuro-navigation and postoperative disease surveillance.

2024-04-01

The Role of Endothelial Cells in Atherosclerosis: Insights from Genetic Association Studies

Pepin ME, Gupta R

DP-Harvard

Endothelial cells (ECs) mediate several biological functions that are relevant to atherosclerosis and coronary artery disease (CAD), regulating an array of vital processes including vascular tone, wound healing, reactive oxygen species, shear stress response, and inflammation. Although it is not yet known which of these functions is linked causally with CAD development and/or progression, genome-wide association studies have implicated more than 400 loci associated with CAD risk, among which several have shown EC-relevant functions. Given the arduous process of mechanistically interrogating single loci to CAD, high-throughput variant characterization methods, including pooled Clustered Regularly Interspaced Short Palindromic Repeats screens, offer exciting potential to rapidly accelerate the discovery of bona fide EC-relevant genetic loci. These discoveries in turn will broaden the therapeutic avenues for CAD beyond lipid lowering and behavioral risk modification to include EC-centric modalities of risk prevention and treatment.

2024-04-02

Imaging Mass Spectrometry of Isotopically Resolved Intact Proteins on a Trapped Ion-Mobility Quadrupole Time-of-Flight Mass Spectrometer

Klein DR, Rivera ES, Caprioli RM, Spraggins JM

TMC-Vanderbilt (Kidney)

In this work, we demonstrate rapid, high spatial, and high spectral resolution imaging of intact proteins by matrix-assisted laser desorption/ionization (MALDI) imaging mass spectrometry (IMS) on a hybrid quadrupole-reflectron time-of-flight (qTOF) mass spectrometer equipped with trapped ion mobility spectrometry (TIMS). Historically, untargeted MALDI IMS of proteins has been performed on TOF mass spectrometers. While advances in TOF instrumentation have enabled rapid, high spatial resolution IMS of intact proteins, TOF mass spectrometers generate relatively low-resolution mass spectra with limited mass accuracy. Conversely, the implementation of MALDI sources on high-resolving power Fourier transform (FT) mass spectrometers has allowed IMS experiments to be conducted with high spectral resolution with the caveat of increasingly long data acquisition times. As illustrated here, qTOF mass spectrometers enable protein imaging with the combined advantages of TOF and FT mass spectrometers. Protein isotope distributions were resolved for both a protein standard mixture and proteins detected from a whole-body mouse pup tissue section. Rapid (∼10 pixels/s) 10 μm lateral spatial resolution IMS was performed on a rat brain tissue section while maintaining isotopic spectral resolution. Lastly, proof-of-concept MALDI-TIMS data was acquired from a protein mixture to demonstrate the ability to differentiate charge states by ion mobility. These experiments highlight the advantages of qTOF and timsTOF platforms for resolving and interpreting complex protein spectra generated from tissue by IMS.

2024-04-03

ComPRePS: An Automated Cloud-based Image Analysis tool to democratize AI in Digital Pathology

Mimar S, Paul AS, Lucarelli N, Border S, Naglah A, Barisoni L, Hodgin J, Rosenberg AZ, Clapp W, Sarder P

HIVE Florida

Artificial intelligence (AI) has extensive applications in a wide range of disciplines including healthcare and clinical practice. Advances in high-resolution whole-slide brightfield microscopy allow for the digitization of histologically stained tissue sections, producing gigapixel-scale whole-slide images (WSI). The significant improvement in computing and revolution of deep neural network (DNN)-based AI technologies over the last decade allow us to integrate massively parallelized computational power, cutting-edge AI algorithms, and big data storage, management, and processing. Applied to WSIs, AI has created opportunities for improved disease diagnostics and prognostics with the ultimate goal of enhancing precision medicine and resulting patient care. The National Institutes of Health (NIH) has recognized the importance of developing standardized principles for data management and discovery for the advancement of science and proposed the Findable, Accessible, Interoperable, Reusable (FAIR) Data Principles with the goal of building a modernized biomedical data resource ecosystem to establish collaborative research communities. In line with this mission and to democratize AI-based image analysis in digital pathology, we propose ComPRePS: an end-to-end automated Computational Renal Pathology Suite which combines massive scalability, on-demand cloud computing, and an easy-to-use web-based user interface for data upload, storage, management, slide-level visualization, and domain expert interaction. Moreover, our platform is equipped with both in-house and collaborator developed sophisticated AI algorithms in the back-end server for image analysis to identify clinically relevant micro-anatomic functional tissue units (FTU) and to extract image features.

2024-04-11

An open source knowledge graph ecosystem for the life sciences

Callahan TJ, Tripodi IJ, Stefanski AL, Cappelletti L, Taneja SB, Wyrwa JM, Casiraghi E, Matentzoglu NA, Reese J, Silverstein JC, Hoyt CT, Boyce RD, Malec SA, Unni DR, Joachimiak MP, Robinson PN, Mungall CJ, Cavalleri E, Fontana T, Valentini G, Mesiti M, Gillenwater LA, Santangelo B, Vasilevsky NA, Hoehndorf R, Bennett TD, Ryan PB, Hripcsak G, Kahn MG, Bada M, Baumgartner WA Jr, Hunter LE

HIVE IEC-PSC

Translational research requires data at multiple scales of biological organization. Advancements in sequencing and multi-omics technologies have increased the availability of these data, but researchers face significant integration challenges. Knowledge graphs (KGs) are used to model complex phenomena, and methods exist to construct them automatically. However, tackling complex biomedical integration problems requires flexibility in the way knowledge is modeled. Moreover, existing KG construction methods provide robust tooling at the cost of fixed or limited choices among knowledge representation models. PheKnowLator (Phenotype Knowledge Translator) is a semantic ecosystem for automating the FAIR (Findable, Accessible, Interoperable, and Reusable) construction of ontologically grounded KGs with fully customizable knowledge representation. The ecosystem includes KG construction resources (e.g., data preparation APIs), analysis tools (e.g., SPARQL endpoint resources and abstraction algorithms), and benchmarks (e.g., prebuilt KGs). We evaluated the ecosystem by systematically comparing it to existing open-source KG construction methods and by analyzing its computational performance when used to construct 12 different large-scale KGs. With flexible knowledge representation, PheKnowLator enables fully customizable KGs without compromising performance or usability.

2024-04-23

Pangenome graph construction from genome alignments with Minigraph-Cactus

Hickey G, Monlong J, Ebler J, Novak AM, Eizenga JM, Gao Y; Human Pangenome Reference Consortium; Marschall T, Li H, Paten B

HIVE TC-CMU

Pangenome references address biases of reference genomes by storing a representative set of diverse haplotypes and their alignment, usually as a graph. Alternate alleles determined by variant callers can be used to construct pangenome graphs, but advances in long-read sequencing are leading to widely available, high-quality phased assemblies. Constructing a pangenome graph directly from assemblies, as opposed to variant calls, leverages the graph's ability to represent variation at different scales. Here we present the Minigraph-Cactus pangenome pipeline, which creates pangenomes directly from whole-genome alignments, and demonstrate its ability to scale to 90 human haplotypes from the Human Pangenome Reference Consortium. The method builds graphs containing all forms of genetic variation while allowing use of current mapping and genotyping tools. We measure the effect of the quality and completeness of reference genomes used for analysis within the pangenomes and show that using the CHM13 reference from the Telomere-to-Telomere Consortium improves the accuracy of our methods. We also demonstrate construction of a Drosophila melanogaster pangenome.

2024-04-23

Single-cell multiplex chromatin and RNA interactions in ageing human brain

Wen X, Luo Z, Zhao W, Calandrelli R, Nguyen TC, Wan X, Charles Richard JL, Zhong S

TTD-UCSD/City of Hope

Dynamically organized chromatin complexes often involve multiplex chromatin interactions and sometimes chromatin-associated RNA. Chromatin complex compositions change during cellular differentiation and ageing, and are expected to be highly heterogeneous among terminally differentiated single cells. Here we introduce the multinucleic acid interaction mapping in single cells (MUSIC) technique for concurrent profiling of multiplex chromatin interactions, gene expression and RNA-chromatin associations within individual nuclei. When applied to 14 human frontal cortex samples from older donors, MUSIC delineated diverse cortical cell types and states. We observed that nuclei exhibiting fewer short-range chromatin interactions were correlated with both an 'older' transcriptomic signature and Alzheimer's disease pathology. Furthermore, the cell type exhibiting chromatin contacts between cis expression quantitative trait loci and a promoter tends to be that in which these cis expression quantitative trait loci specifically affect the expression of their target gene. In addition, female cortical cells exhibit highly heterogeneous interactions between XIST non-coding RNA and chromosome X, along with diverse spatial organizations of the X chromosomes. MUSIC presents a potent tool for exploration of chromatin architecture and transcription at cellular resolution in complex tissues.

2024-05-24

Single-cell multiomics guided mechanistic understanding of Fontan-associated liver disease

Hu P, Rychik J, Zhao J, Bai H, Bauer A, Yu W, Rand EB, Dodds KM, Goldberg DJ, Tan K, Wilkins BJ, Pei L

TMC-CHOP

The Fontan operation is the current standard of care for single-ventricle congenital heart disease. Individuals with a Fontan circulation (FC) exhibit central venous hypertension and face life-threatening complications of hepatic fibrosis, known as Fontan-associated liver disease (FALD). The fundamental biology and mechanisms of FALD are little understood. Here, we generated a transcriptomic and epigenomic atlas of human FALD at single-cell resolution using multiomic snRNA-ATAC-seq. We found profound cell type-specific transcriptomic and epigenomic changes in FC livers. Central hepatocytes (cHep) exhibited the most substantial changes, featuring profound metabolic reprogramming. These cHep changes preceded substantial activation of hepatic stellate cells and liver fibrosis, suggesting cHep as a potential first "responder" in the pathogenesis of FALD. We also identified a network of ligand-receptor pairs that transmit signals from cHep to hepatic stellate cells, which may promote their activation and liver fibrosis. We further experimentally demonstrated that activins A and B promote fibrotic activation in vitro and identified mechanisms of activin A's transcriptional activation in FALD. Together, our single-cell transcriptomic and epigenomic atlas revealed mechanistic insights into the pathogenesis of FALD and may aid identification of potential therapeutic targets.

2024-05-25

Generation of Tailored Extracellular Matrix Hydrogels for the Study of In Vitro Folliculogenesis in Response to Matrisome-Dependent Biochemical Cues

McDowell HB, McElhinney KL, Tsui EL, Laronda MM

While ovarian tissue cryopreservation (OTC) is an important fertility preservation option, it has its limitations. Improving OTC and ovarian tissue transplantation (OTT) must include extending the function of reimplanted tissue by reducing the extensive activation of primordial follicles (PMFs) and eliminating the risk of reimplanting malignant cells. To develop a more effective OTT, we must understand the effects of the ovarian microenvironment on folliculogenesis. Here, we describe a method for producing decellularized extracellular matrix (dECM) hydrogels that reflect the protein composition of the ovary. These ovarian dECM hydrogels were engineered to assess the effects of ECM on in vitro follicle growth, and we developed a novel method for selectively removing proteins of interest from dECM hydrogels. Finally, we validated the depletion of these proteins and successfully cultured murine follicles encapsulated in the compartment-specific ovarian dECM hydrogels and these same hydrogels depleted of EMILIN1. These are the first, optically clear, tailored tissue-specific hydrogels that support follicle survival and growth comparable to the "gold standard" alginate hydrogels. Furthermore, depleted hydrogels can serve as a novel tool for many tissue types to evaluate the impact of specific ECM proteins on cellular and molecular behavior.

2024-06-05

Role of Artificial Intelligence in Kidney Pathology: Promises and Pitfalls

Goodman K, Sarullo K, Swamidass SJ, Gaut JP, Jain S

TMC-WUSTL

2024-06-06

Mapping the cellular biogeography of human bone marrow niches using single-cell transcriptomics and proteomic imaging

Bandyopadhyay S, Duffy M, Ahn KJ, Pang M, Smith D, Duncan G, Sussman J, Zhang I, Huang J, Lin Y, Xiong B, Imtiaz T, Chen CH, Thadi A, Chen C, Xu J, Reichart M, Pillai V, Snaith O, Oldridge D, Bhattacharyya S, Maillard I, Carroll M, Nelson C, Qin L, Tan K

TMC-CHOP

Non-hematopoietic cells are essential contributors to hematopoiesis. However, heterogeneity and spatial organization of these cells in human bone marrow remain largely uncharacterized. We used single-cell RNA sequencing (scRNA-seq) to profile 29,325 non-hematopoietic cells and discovered nine transcriptionally distinct subtypes. We simultaneously profiled 53,417 hematopoietic cells and predicted their interactions with non-hematopoietic subsets. We employed co-detection by indexing (CODEX) to spatially profile over 1.2 million cells. We integrated scRNA-seq and CODEX data to link predicted cellular signaling with spatial proximity. Our analysis revealed a hyperoxygenated arterio-endosteal neighborhood for early myelopoiesis, and an adipocytic localization for early hematopoietic stem and progenitor cells (HSPCs). We used our CODEX atlas to annotate new images and uncovered mesenchymal stromal cell (MSC) expansion and spatial neighborhoods co-enriched for leukemic blasts and MSCs in acute myeloid leukemia (AML) patient samples. This spatially resolved, multiomic atlas of human bone marrow provides a reference for investigation of cellular interactions that drive hematopoiesis.

2024-06-13

Top-down proteomics

Roberts DS, Loo JA, Tsybin YO, Liu X, Wu S, Chamot-Rooke J, Agar JN, Paša-Tolić L, Smith LM, Ge Y

TTD-PNNL

Proteoforms, which arise from post-translational modifications, genetic polymorphisms and RNA splice variants, play a pivotal role as drivers in biology. Understanding proteoforms is essential to unravel the intricacies of biological systems and bridge the gap between genotypes and phenotypes. By analysing whole proteins without digestion, top-down proteomics (TDP) provides a holistic view of the proteome and can decipher protein function, uncover disease mechanisms and advance precision medicine. This Primer explores TDP, including the underlying principles, recent advances and an outlook on the future. The experimental section discusses instrumentation, sample preparation, intact protein separation, tandem mass spectrometry techniques and data collection. The results section looks at how to decipher raw data, visualize intact protein spectra and unravel data analysis. Additionally, proteoform identification, characterization and quantification are summarized, alongside approaches for statistical analysis. Various applications are described, including the human proteoform project and biomedical, biopharmaceutical and clinical sciences. These are complemented by discussions on measurement reproducibility, limitations and a forward-looking perspective that outlines areas where the field can advance, including potential future applications.

2024-06-18

Predicting transcriptional outcomes of novel multigene perturbations with GEARS

Roohani Y, Huang K, Leskovec J

TMC-Stanford

Understanding cellular responses to genetic perturbation is central to numerous biomedical applications, from identifying genetic interactions involved in cancer to developing methods for regenerative medicine. However, the combinatorial explosion in the number of possible multigene perturbations severely limits experimental interrogation. Here, we present graph-enhanced gene activation and repression simulator (GEARS), a method that integrates deep learning with a knowledge graph of gene-gene relationships to predict transcriptional responses to both single and multigene perturbations using single-cell RNA-sequencing data from perturbational screens. GEARS is able to predict outcomes of perturbing combinations consisting of genes that were never experimentally perturbed. GEARS exhibited 40% higher precision than existing approaches in predicting four distinct genetic interaction subtypes in a combinatorial perturbation screen and identified the strongest interactions twice as well as prior approaches. Overall, GEARS can predict phenotypically distinct effects of multigene perturbations and thus guide the design of perturbational experiments.

2024-06-18

Hardware and software solutions for implementing nanospray desorption electrospray ionization (nano-DESI) sources on commercial mass spectrometers

Jiang LX, Hilger RT, Laskin J

TTD-Purdue

Nanospray desorption electrospray ionization (nano-DESI) is an ambient ionization mass spectrometry imaging (MSI) approach that enables spatial mapping of biological and environmental samples with high spatial resolution and throughput. Because nano-DESI has not yet been commercialized, researchers develop their own sources and interface them with different commercial mass spectrometers. Previously, several protocols focusing on the fabrication of nano-DESI probes have been reported. In this tutorial, we discuss different hardware requirements for coupling the nano-DESI source to commercial mass spectrometers, such as the safety interlock, inlet extension, and contact closure. In addition, we describe the structure of our custom software for controlling the nano-DESI MSI platform and provide detailed instructions for its usage. With this tutorial, interested researchers should be able to implement nano-DESI experiments in their labs.

2024-07-01

PanIN or IPMN? Redefining Lesion Size in 3 Dimensions

Kiemen AL, Dequiedt L, Shen Y, Zhu Y, Matos-Romero V, Forjaz A, Campbell K, Dhana W, Cornish T, Braxton AM, Wu PH, Fishman EK, Wood LD, Wirtz D, Hruban RH

TMC-JHU

Pancreatic ductal adenocarcinoma (PDAC) develops from 2 known precursor lesions: a majority (∼85%) develops from pancreatic intraepithelial neoplasia (PanIN), and a minority develops from intraductal papillary mucinous neoplasms (IPMNs). Clinical classification of PanIN and IPMN relies on a combination of low-resolution, 3-dimensional (D) imaging (computed tomography, CT), and high-resolution, 2D imaging (histology). The definitions of PanIN and IPMN currently rely heavily on size. IPMNs are defined as macroscopic: generally >1.0 cm and visible in CT, and PanINs are defined as microscopic: generally <0.5 cm and not identifiable in CT. As 2D evaluation fails to take into account 3D structures, we hypothesized that this classification would fail in evaluation of high-resolution, 3D images. To characterize the size and prevalence of PanINs in 3D, 47 thick slabs of pancreas were harvested from grossly normal areas of pancreatic resections, excluding samples from individuals with a diagnosis of an IPMN. All patients but one underwent preoperative CT scans. Through construction of cellular resolution 3D maps, we identified >1400 ductal precursor lesions that met the 2D histologic size criteria of PanINs. We show that, when 3D space is considered, 25 of these lesions can be digitally sectioned to meet the 2D histologic size criterion of IPMN. Re-evaluation of the preoperative CT images of individuals found to possess these large precursor lesions showed that nearly half are visible on imaging. These findings demonstrate that the clinical classification of PanIN and IPMN fails in evaluation of high-resolution, 3D images, emphasizing the need for re-evaluation of classification guidelines that place significant weight on 2D assessment of 3D structures.

2024-07-01

Validation of an organ mapping antibody panel for cyclical immunofluorescence microscopy on normal human kidneys

Brewer M, Migas LG, Clouthier KA, Allen JL, Anderson DM, Pingry E, Farrow M, Quardokus EM, Spraggins JM, Van de Plas R, de Caestecker MP

TMC-Vanderbilt (Kidney)

The lack of standardization in antibody validation remains a major contributor to irreproducibility of human research. To address this, we have applied a standardized approach to validate a panel of antibodies to identify 18 major cell types and 5 extracellular matrix compartments in the human kidney by immunofluorescence (IF) microscopy. We have used these to generate an organ mapping antibody panel for two-dimensional (2-D) and three-dimensional (3-D) cyclical IF (CyCIF) to provide a more detailed method for evaluating tissue segmentation and volumes using a larger panel of markers than would normally be possible using standard fluorescence microscopy. CyCIF also makes it possible to perform multiplexed IF microscopy of whole slide images, which is a distinct advantage over other multiplexed imaging technologies that are applicable to limited fields of view. This enables a broader view of cell distributions across larger anatomical regions, allowing a better chance to capture localized regions of dysfunction in diseased tissues. These methods are broadly accessible to any laboratory with a fluorescence microscope, enabling spatial cellular phenotyping in normal and disease states. We also provide a detailed solution for image alignment between CyCIF cycles that can be used by investigators to perform these studies without programming experience using open-sourced software. 
This ability to perform multiplexed imaging without specialized instrumentation or computational skills opens the door to integration with more highly dimensional molecular imaging modalities such as spatial transcriptomics and imaging mass spectrometry, enabling the discovery of molecular markers of specific cell types, and how these are altered in disease. NEW & NOTEWORTHY: We describe here validation criteria used to define an organ mapping panel of antibodies that can be used to define 18 cell types and five extracellular matrix compartments using cyclical immunofluorescence (CyCIF) microscopy. As CyCIF does not require specialized instrumentation, and image registration required to assemble CyCIF images can be performed by any laboratory without specialized computational skills, this technology is accessible to any laboratory with access to a fluorescence microscope and digital scanner.

2024-07-01

Exocrine Pancreas in Type 1 and Type 2 Diabetes: Different Patterns of Fibrosis, Metaplasia, Angiopathy, and Adiposity

Wright JJ, Eskaros A, Windon A, Bottino R, Jenkins R, Bradley AM, Aramandla R, Philips S, Kang H, Saunders DC, Brissova M, Powers AC

TMC-Vanderbilt (Eye/pancreas)

The endocrine and exocrine compartments of the pancreas are spatially related but functionally distinct. Multiple diseases affect both compartments, including type 1 diabetes (T1D), pancreatitis, cystic fibrosis, and pancreatic cancer. To better understand how the exocrine pancreas changes with age, obesity, and diabetes, we performed a systematic analysis of well-preserved tissue sections from the pancreatic head, body, and tail of organ donors with T1D (n = 20) or type 2 diabetes (T2D) (n = 25) and donors with no diabetes (ND; n = 74). Among ND donors, we found that the incidence of acinar-to-ductal metaplasia (ADM), angiopathy, and pancreatic adiposity increased with age, and ADM and adiposity incidence also increased with BMI. Compared with age- and sex-matched ND organs, T1D pancreata had greater rates of acinar atrophy and angiopathy, with fewer intralobular adipocytes. T2D pancreata had greater rates of ADM and angiopathy and a higher total number of T lymphocytes, but no difference in adipocyte number, compared with ND organs. Although total pancreatic fibrosis was increased in both T1D and T2D, the patterns were different, with periductal and perivascular fibrosis occurring more frequently in T1D pancreata and lobular and parenchymal fibrosis occurring more frequently in T2D. Thus, the exocrine pancreas undergoes distinct changes as individuals age or develop T1D or T2D.

2024-07-13

Terminal deoxynucleotidyl transferase and CD84 identify human multi-potent lymphoid progenitors

Kim Y, Calderon AA, Favaro P, Glass DR, Tsai AG, Ho D, Borges L, Greenleaf WJ, Bendall SC

RTI-Stanford

Lymphoid specification in human hematopoietic progenitors is not fully understood. To better associate lymphoid identity with protein-level cell features, we conduct a highly multiplexed single-cell proteomic screen on human bone marrow progenitors. This screen identifies terminal deoxynucleotidyl transferase (TdT), a specialized DNA polymerase intrinsic to VDJ recombination, broadly expressed within CD34⁺ progenitors prior to B/T cell emergence. While these TdT⁺ cells coincide with granulocyte-monocyte progenitor (GMP) immunophenotype, their accessible chromatin regions show enrichment for lymphoid-associated transcription factor (TF) motifs. TdT expression on GMPs is inversely related to the SLAM family member CD84. Prospective isolation of CD84^lo GMPs demonstrates robust lymphoid potentials ex vivo, while still retaining significant myeloid differentiation capacity, akin to LMPPs. This multi-omic study identifies human bone marrow lymphoid-primed progenitors, further defining the lympho-myeloid axis in human hematopoiesis.

2024-07-18

Leveraging neighborhood representations of single-cell data to achieve sensitive DE testing with miloDE

Missarova A, Dann E, Rosen L, Satija R, Marioni J

HIVE MC-NYGC

Single-cell RNA-sequencing enables testing for differential expression (DE) between conditions at a cell type level. While powerful, one of the limitations of such approaches is that the sensitivity of DE testing is dictated by the sensitivity of clustering, which is often suboptimal. To overcome this, we present miloDE, a cluster-free framework for DE testing (available as an open-source R package). We illustrate the performance of miloDE on both simulated and real data. Using miloDE, we identify a transient hemogenic endothelia-like state in mouse embryos lacking Tal1 and detect distinct programs during macrophage activation in idiopathic pulmonary fibrosis.

2024-07-23

Integration of spatial and single-cell data across modalities with weakly linked features

Nolan G, Ma Z, Zhang N

TMC-Stanford

Although single-cell and spatial sequencing methods enable simultaneous measurement of more than one biological modality, no technology can capture all modalities within the same cell. For current data integration methods, the feasibility of cross-modal integration relies on the existence of highly correlated, a priori 'linked' features. We describe matching X-modality via fuzzy smoothed embedding (MaxFuse), a cross-modal data integration method that, through iterative coembedding, data smoothing and cell matching, uses all information in each modality to obtain high-quality integration even when features are weakly linked. MaxFuse is modality-agnostic and demonstrates high robustness and accuracy in the weak linkage scenario, achieving 20~70% relative improvement over existing methods under key evaluation metrics on benchmarking datasets. A prototypical example of weak linkage is the integration of spatial proteomic data with single-cell sequencing data. On two example analyses of this type, MaxFuse enabled the spatial consolidation of proteomic, transcriptomic and epigenomic information at single-cell resolution on the same tissue section.

2024-07-25

Local, Sustained, and Targeted Co-Delivery of MEK Inhibitor and Doxorubicin Inhibits Tumor Progression in E-Cadherin-Positive Breast Cancer

Kuhn PM, Russo GC, Crawford AJ, Venkatraman A, Yang N, Starich BA, Schneiderman Z, Wu PH, Vo T, Wirtz D, Kokkoli E

TMC-JHU

Effectively utilizing MEK inhibitors in the clinic remains challenging due to off-target toxicity and lack of predictive biomarkers. Recent findings propose E-cadherin, a breast cancer diagnostic indicator, as a predictor of MEK inhibitor success. To address MEK inhibitor toxicity, traditional methodologies have systemically delivered nanoparticles, which require frequent, high-dose injections. Here, we present a different approach, employing a thermosensitive, biodegradable hydrogel with functionalized liposomes for local, sustained release of MEK inhibitor PD0325901 and doxorubicin. The poly(δ-valerolactone-co-lactide)-b-poly(ethylene-glycol)-b-poly(δ-valerolactone-co-lactide) triblock co-polymer gels at physiological temperature and has an optimal degradation time in vivo. Liposomes were functionalized with PR_b, a biomimetic peptide targeting the α₅β₁ integrin receptor, which is overexpressed in E-cadherin-positive triple negative breast cancer (TNBC). In various TNBC models, the hydrogel-liposome system delivered via local injection reduced tumor progression and improved animal survival without toxic side effects. Our work presents the first demonstration of local, sustained delivery of MEK inhibitors to E-cadherin-positive tumors alongside traditional chemotherapeutics, offering a safe and promising therapeutic strategy.

2024-07-29

Signal amplification by cyclic extension enables high-sensitivity single-cell mass cytometry

Lun XK, Sheng K, Yu X, Lam CY, Gowri G, Serrata M, Zhai Y, Su H, Luan J, Kim Y, Ingber DE, Jackson HW, Yaffe MB, Yin P

TTD-Harvard

Mass cytometry uses metal-isotope-tagged antibodies to label targets of interest, which enables simultaneous measurements of ~50 proteins or protein modifications in millions of single cells, but its sensitivity is limited. Here, we present a signal amplification technology, termed Amplification by Cyclic Extension (ACE), implementing thermal-cycling-based DNA in situ concatenation in combination with 3-cyanovinylcarbazole phosphoramidite-based DNA crosslinking to enable signal amplification simultaneously on >30 protein epitopes. We demonstrate the utility of ACE in low-abundance protein quantification with suspension mass cytometry to characterize molecular reprogramming during the epithelial-to-mesenchymal transition as well as the mesenchymal-to-epithelial transition. We show the capability of ACE to quantify the dynamics of signaling network responses in human T lymphocytes. We further present the application of ACE in imaging mass cytometry-based multiparametric tissue imaging to identify tissue compartments and profile spatial aspects related to pathological states in polycystic kidney tissues.

2024-08-07

MALDI TIMS IMS Reveals Ganglioside Molecular Diversity within Murine S. aureus Kidney Tissue Abscesses

Djambazova KV, Gibson-Corley KN, Freiberg JA, Caprioli RM, Skaar EP, Spraggins JM

TMC-Vanderbilt (Kidney)

Gangliosides play important roles in innate and adaptive immunity. The high degree of structural heterogeneity results in significant variability in ganglioside expression patterns and greatly complicates linking structure and function. Structural characterization at the site of infection is essential in elucidating host ganglioside function in response to invading pathogens, such as Staphylococcus aureus (S. aureus). Matrix-assisted laser desorption/ionization imaging mass spectrometry (MALDI IMS) enables high-specificity spatial investigation of intact gangliosides. Here, ganglioside structural and spatial heterogeneity within an S. aureus-infected mouse kidney abscess was characterized. Differences in spatial distributions were observed for gangliosides of different classes and those that differ in ceramide chain composition and oligosaccharide-bound sialic acid. Furthermore, integrating trapped ion mobility spectrometry (TIMS) allowed for the gas-phase separation and visualization of monosialylated ganglioside isomers that differ in sialic acid type and position. The isomers differ in spatial distributions within the host-pathogen interface, where molecular patterns revealed new molecular zones in the abscess previously unidentified by traditional histology.

2024-08-11

Toward universal cell embeddings: integrating single-cell RNA-seq datasets across species with SATURN

Rosen Y, Brbić M, Roohani Y, Swanson K, Li Z, Leskovec J

TMC-Stanford

Analysis of single-cell datasets generated from diverse organisms offers unprecedented opportunities to unravel fundamental evolutionary processes of conservation and diversification of cell types. However, interspecies genomic differences limit the joint analysis of cross-species datasets to homologous genes. Here we present SATURN, a deep learning method for learning universal cell embeddings that encodes genes' biological properties using protein language models. By coupling protein embeddings from language models with RNA expression, SATURN integrates datasets profiled from different species regardless of their genomic similarity. SATURN can detect functionally related genes coexpressed across species, redefining differential expression for cross-species analysis. Applying SATURN to three species whole-organism atlases and frog and zebrafish embryogenesis datasets, we show that SATURN can effectively transfer annotations across species, even when they are evolutionarily remote. We also demonstrate that SATURN can be used to find potentially divergent gene functions between glaucoma-associated genes in humans and four other species.

2024-08-13

Coupling Microdroplet-Based Sample Preparation, Multiplexed Isobaric Labeling, and Nanoflow Peptide Fractionation for Deep Proteome Profiling of the Tissue Microenvironment

Veličković M, Fillmore TL, Attah IK, Posso C, Pino JC, Zhao R, Williams SM, Veličković D, Jacobs JM, Burnum-Johnson KE, Zhu Y, Piehowski PD

TTD-Purdue

There is increasing interest in developing in-depth proteomic approaches for mapping tissue heterogeneity in a cell-type-specific manner to better understand and predict the function of complex biological systems such as human organs. Existing spatially resolved proteomics technologies cannot provide deep proteome coverage due to limited sensitivity and poor sample recovery. Herein, we seamlessly combined laser capture microdissection with a low-volume sample processing technology that includes a microfluidic device named microPOTS (microdroplet processing in one pot for trace samples), multiplexed isobaric labeling, and a nanoflow peptide fractionation approach. The integrated workflow allowed us to maximize proteome coverage of laser-isolated tissue samples containing nanogram levels of proteins. We demonstrated that the deep spatial proteomics platform can quantify more than 5000 unique proteins from a small-sized human pancreatic tissue pixel (∼60,000 μm²) and differentiate unique protein abundance patterns in pancreas. Furthermore, the use of the microPOTS chip eliminated the requirement for advanced microfabrication capabilities and specialized nanoliter liquid handling equipment, making it more accessible to proteomic laboratories.

2024-08-13

Patterns of forearm lymphatic drainage to the epitrochlear lymph nodes in 1400 cutaneous melanoma patients

Fanning JE, Singhal D, Reynolds HM, Don TDJ, Donohoe KJ, Suami H, Chung DKV

TMC-BIDMC

Background: Variations of hand and forearm lymphatic drainage to upper-arm lymphatic pathways may impact the route of melanoma metastasis. This study compared rates of lymphatic drainage to epitrochlear nodes between anatomic divisions of the hand and forearm to determine whether the anatomic distribution of hand and forearm melanomas affects the likelihood of drainage to epitrochlear lymph nodes. Methods: Using a single-institution lymphoscintigraphy database, we identified all patients with cutaneous melanoma on the hand and forearm. A body-map two-dimensional coordinate system was used to classify cutaneous melanoma sites between radial-ulnar and dorsal-volar divisions. Sentinel lymph nodes (SLNs) visualized on lymphoscintigraphy were recorded. Proportions of patients with epitrochlear SLNs were compared between anatomic divisions using χ² analysis. Results: Of 3628 upper extremity cutaneous melanoma patients who underwent lymphatic mapping with lymphoscintigraphy, 1400 met inclusion criteria. Twenty-one percent of patients demonstrated epitrochlear SLNs. Epitrochlear SLNs were observed in 27% of dorsal forearm melanomas and 15% of volar forearm melanomas (p < 0.001). Epitrochlear SLNs were observed in 31% of ulnar forearm melanomas and 17% of radial forearm melanomas (p < 0.001). Conclusions: Higher proportions of dorsal and ulnar forearm melanomas have epitrochlear SLNs. Metastasis to epitrochlear SLNs may be more likely from melanomas in these respective forearm regions.

2024-08-22

Heterogeneity of ovarian matrisome hydrogels elucidates factors that may influence follicle growth in vitro

McDowell HB, Henning NF, Laronda MM

This work describes a valuable and reproducible method for generating optically clear bovine ovary-derived hydrogels that support in vitro murine follicle growth. These techniques are the foundation on which follicle growth dynamics and matrisome protein composition may be correlated to reveal the influence of matrisome proteins on folliculogenesis.

2024-08-26

Single-cell analysis of chromatin and expression reveals age- and sex-associated alterations in the human heart

Read DF, Booth GT, Daza RM, Jackson DL, Gladden RG, Srivatsan SR, Ewing B, Franks JM, Spurrell CH, Gomes AR, O'Day D, Gogate AA, Martin BK, Larson H, Pfleger C, Starita L, Lin Y, Shendure J, Lin S, Trapnell C

TMC-Cal Tech

Sex differences and age-related changes in the human heart at the tissue, cell, and molecular level have been well-documented and many may be relevant for cardiovascular disease. However, how molecular programs within individual cell types vary across individuals by age and sex remains poorly characterized. To better understand this variation, we performed single-nucleus combinatorial indexing (sci) ATAC- and RNA-Seq in human heart samples from nine donors. We identify hundreds of differentially expressed genes by age and sex and find epigenetic signatures of variation in ATAC-Seq data in this discovery cohort. We then scale up our single-cell RNA-Seq analysis by combining our data with five recently published single nucleus RNA-Seq datasets of healthy adult hearts. We find variation such as metabolic alterations by sex and immune changes by age in differential expression tests, as well as alterations in abundance of cardiomyocytes by sex and neurons with age. In addition, we compare our adult-derived ATAC-Seq profiles to analogous fetal cell types to identify putative developmental-stage-specific regulatory factors. Finally, we train predictive models of cell-type-specific RNA expression levels utilizing ATAC-Seq profiles to link distal regulatory sequences to promoters, quantifying the predictive value of a simple TF-to-expression regulatory grammar and identifying cell-type-specific TFs. Our analysis represents the largest single-cell analysis of cardiac variation by age and sex to date and provides a resource for further study of healthy cardiac variation and transcriptional regulation at single-cell resolution.

2024-08-27

Multiplexed in situ protein imaging using DNA-barcoded antibodies with extended hybridization chain reactions

Wang Y, Liu X, Zeng Y, Saka SK, Xie W, Goldaracena I, Kohman RE, Yin P, Church GM

TTD-Harvard

Antibodies have long served as vital tools in biological and clinical laboratories for the specific detection of proteins. Conventional methods employ fluorophore or horseradish peroxidase-conjugated antibodies to detect signals. More recently, DNA-conjugated antibodies have emerged as a promising technology, capitalizing on the programmability and amplification capabilities of DNA to enable highly multiplexed and ultrasensitive protein detection. However, the nonspecific binding of DNA-conjugated antibodies has impeded the widespread adoption of this approach. Here, we present a novel DNA-conjugated antibody staining protocol that addresses these challenges and demonstrates superior performance in suppressing nonspecific signals compared to previously published protocols. We further extend the utility of DNA-conjugated antibodies for signal-amplified in situ protein imaging through the hybridization chain reaction (HCR) and design a novel HCR DNA pair to expand the HCR hairpin pool from the previously published 5 pairs to 13, allowing for flexible hairpin selection and higher multiplexing. Finally, we demonstrate highly multiplexed in situ protein imaging using these techniques in both cultured cells and tissue sections.

2024-08-30

Investigating quantitative histological characteristics in renal pathology using HistoLens

Border SP, Tomaszewski JE, Yoshida T, Kopp JB, Hodgin JB, Clapp WL, Rosenberg AZ, Buyon JP, Sarder P

HIVE TC-Florida

HistoLens is an open-source graphical user interface developed using MATLAB AppDesigner for visual and quantitative analysis of histological datasets. HistoLens enables users to interrogate sets of digitally annotated whole slide images to efficiently characterize histological differences between disease and experimental groups. Users can dynamically visualize the distribution of 448 hand-engineered features quantifying color, texture, morphology, and distribution across microanatomic sub-compartments. Additionally, users can map differentially detected image features within the images by highlighting affected regions. We demonstrate the utility of HistoLens to identify hand-engineered features that correlate with pathognomonic renal glomerular characteristics distinguishing diabetic nephropathy and amyloid nephropathy from the histologically unremarkable glomeruli in minimal change disease. Additionally, we examine the use of HistoLens for glomerular feature discovery in the Tg26 mouse model of HIV-associated nephropathy. We identify numerous quantitative glomerular features distinguishing Tg26 transgenic mice from wild-type mice, corresponding to a progressive renal disease phenotype. Thus, we demonstrate an off-the-shelf and ready-to-use toolkit for quantitative renal pathology applications.

2024-08-30

Markov models for clinical decision-making in radiation oncology: A systematic review

McCullum LB, Karagoz A, Dede C, Garcia R, Nosrat F, Hemmati M, Hosseinian S, Schaefer AJ, Fuller CD; Rice/MD Anderson Center for Operations Research in Cancer (CORC); MD Anderson Head and Neck Cancer Symptom Working Group

HIVE IEC-PSC

The intrinsic stochasticity of patients' response to treatment is a major consideration for clinical decision-making in radiation therapy. Markov models are powerful tools to capture this stochasticity and render effective treatment decisions. This paper provides an overview of the Markov models for clinical decision analysis in radiation oncology. A comprehensive literature search was conducted within MEDLINE using PubMed, following the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) guidelines. Only studies published from 2000 to 2023 were considered. Selected publications were summarized in two categories: (i) studies that compare two (or more) fixed treatment policies using Monte Carlo simulation and (ii) studies that seek an optimal treatment policy through Markov Decision Processes (MDPs). Relevant to the scope of this study, 61 publications were selected for detailed review. The majority of these publications (n = 56) focused on comparative analysis of two or more fixed treatment policies using Monte Carlo simulation. Classifications based on cancer site, utility measures and the type of sensitivity analysis are presented. Five publications considered MDPs with the aim of computing an optimal treatment policy; a detailed statement of the analysis and results is provided for each work. As an extension of Markov model-based simulation analysis, MDP offers a flexible framework to identify an optimal treatment policy among a possibly large set of treatment policies. However, the applications of MDPs to oncological decision-making have been understudied, and the full capacity of this framework to render complex optimal treatment decisions warrants further consideration.

2024-09-04

E-Cadherin Induces Serine Synthesis to Support Progression and Metastasis of Breast Cancer

Lee G, Wong C, Cho A, West JJ, Crawford AJ, Russo GC, Si BR, Kim J, Hoffner L, Jang C, Jung M, Leone RD, Konstantopoulos K, Ewald AJ, Wirtz D, Jeong S

TMC-JHU

The loss of E-cadherin, an epithelial cell adhesion molecule, has been implicated in metastasis by mediating the epithelial-mesenchymal transition, which promotes invasion and migration of cancer cells. However, recent studies have demonstrated that E-cadherin supports the survival and proliferation of metastatic cancer cells. Here, we identified a metabolic role for E-cadherin in breast cancer by upregulating the de novo serine synthesis pathway (SSP). The upregulated SSP provided metabolic precursors for biosynthesis and resistance to oxidative stress, enabling E-cadherin+ breast cancer cells to achieve faster tumor growth and enhanced metastases. Inhibition of phosphoglycerate dehydrogenase, a rate-limiting enzyme in the SSP, significantly and specifically hampered proliferation of E-cadherin+ breast cancer cells and rendered them vulnerable to oxidative stress, inhibiting their metastatic potential. These findings reveal that E-cadherin reprograms cellular metabolism, promoting tumor growth and metastasis of breast cancers. Significance: E-Cadherin promotes the progression and metastasis of breast cancer by upregulating the de novo serine synthesis pathway, offering promising targets for inhibiting tumor growth and metastasis in E-cadherin-expressing tumors.

2024-09-24

Collateralization of the upper extremity lymphatic system after axillary lymph node dissection

Fanning JE, Chung DKV, Reynolds HM, Jayathungage Don TD, Suami H, Donohoe KJ, Singhal D

TMC-BIDMC

Background: Lymphatic drainage from the arm may be altered after axillary lymph node dissection (ALND). Understanding these alterations is important as they may change standard surgical and radiation treatment in recurrent breast cancer or upper extremity skin cancers, including melanoma. Methods: Utilizing a single-institution planar and single photon emission computed tomography/computed tomography lymphoscintigraphy database, we identified patients with a diagnosis of upper extremity cutaneous melanoma from 2008 to 2023 who previously underwent ALND for cancer treatment and did not develop upper extremity cancer-related lymphedema. ALND patients were matched to control patients presenting with cutaneous melanomas at the same anatomic sites. Sentinel lymph nodes (SLNs) were compared between both groups. Results: Of 3628 upper extremity cutaneous melanoma patients, 934 met inclusion criteria, including 22 ALND and 912 control patients. Level I axillary SLN drainage was observed in 98% of controls and 27% of ALND patients (p < 0.001). Level II axillary SLN drainage was observed in 3% of controls and 27% of ALND patients (p < 0.001). Level III axillary SLN drainage was observed in 1% of controls and 32% of ALND patients (p < 0.001). Epitrochlear SLN drainage was observed in 9% of controls and 32% of ALND patients, respectively (p < 0.046). Brachial SLN drainage was observed in 4% of controls and 23% of ALND patients (p < 0.001). Conclusions: Distinct changes in functional lymphatic drainage were seen between the arms of patients who previously underwent ALND versus control patients. Levels II and III axillary, epitrochlear, and brachial nodes are possible sites of metastatic disease that should be considered in patients with a prior ALND.

2024-09-27

Vitessce: integrative visualization of multimodal and spatially resolved single-cell data

Keller MS, Gold I, McCallum C, Manz T, Kharchenko PV, Gehlenborg N

HIVE TC-Harvard

Multiomics technologies with single-cell and spatial resolution make it possible to measure thousands of features across millions of cells. However, visual analysis of high-dimensional transcriptomic, proteomic, genome-mapped and imaging data types simultaneously remains a challenge. Here we describe Vitessce, an interactive web-based visualization framework for exploration of multimodal and spatially resolved single-cell data. We demonstrate integrative visualization of millions of data points, including cell-type annotations, gene expression quantities, spatially resolved transcripts and cell segmentations, across multiple coordinated views. The open-source software is available at http://vitessce.io.

2024-10-01

vSPACE: exploring virtual spatial representation of articular chondrocytes at the single-cell level

Zhang C, Wang H, Chung Y, Hong SH, Olmer M, Swahn H, Lotz M, Maye P, Rowe D, Shin DG

TMC-UConn/Scripps

Summary: vSPACE is a web-based application presenting a spatial representation of scRNAseq data obtained from human articular cartilage by emulating the concept of spatial transcriptomics technology, but virtually. This virtual 2D plot presentation of human articular cartilage cells generates several zonal distribution patterns, for one or multiple genes at a time, revealing patterns that scientists can appreciate as imputed spatial distribution patterns along the zonal axis. Availability and implementation: vSPACE is implemented in Python Dash as a web-based toolbox designed for data visualization of zonal gene expression patterns in articular cartilage chondrocytes. This tool is freely accessible at https://vspace.cse.uconn.edu/. The source code and extra materials for this service can be downloaded from https://github.com/zhacheny/vSPACE.

2024-10-09

Spatial transcriptomics in health and disease

Jain S, Eadon MT

TMC-WUSTL

The ability to localize hundreds of macromolecules to discrete locations, structures and cell types in a tissue is a powerful approach to understand the cellular and spatial organization of an organ. Spatially resolved transcriptomic technologies enable mapping of transcripts at single-cell or near single-cell resolution in a multiplex manner. The rapid development of spatial transcriptomic technologies has accelerated the pace of discovery in several fields, including nephrology. Its application to preclinical models and human samples has provided spatial information about new cell types discovered by single-cell sequencing and new insights into the cell-cell interactions within neighbourhoods, and has improved our understanding of the changes that occur in response to injury. Integration of spatial transcriptomic technologies with other omics methods, such as proteomics and spatial epigenetics, will further facilitate the generation of comprehensive molecular atlases, and provide insights into the dynamic relationships of molecular components in homeostasis and disease. This Review provides an overview of current and emerging spatial transcriptomic methods, their applications and remaining challenges for the field.

2024-10-22

Thermal Denaturation of Fresh Frozen Tissue Enhances Mass Spectrometry Detection of Peptides

Kruse ARS, Judd AM, Gutierrez DB, Allen JL, Dufresne M, Farrow MA, Powers AC, Norris JL, Caprioli RM, Spraggins JM

TMC-Vanderbilt (Eye/pancreas)

Thermal denaturation (TD), known as antigen retrieval, heats tissue samples in a buffered solution to expose protein epitopes. Thermal denaturation of formalin-fixed paraffin-embedded samples enhances on-tissue tryptic digestion, increasing peptide detection using matrix-assisted laser desorption ionization imaging mass spectrometry (MALDI IMS). We investigated the tissue-dependent effects of TD on peptide MALDI IMS and liquid chromatography-tandem mass spectrometry signal in unfixed, frozen human colon, ovary, and pancreas tissue. In a triplicate experiment using time-of-flight, orbitrap, and Fourier-transform ion cyclotron resonance mass spectrometry platforms, we found that TD had a tissue-dependent effect on peptide signal, resulting in a 22.5% improvement in peptide detection from the colon, a 73.3% improvement in ovary tissue, and a 96.6% improvement in pancreas tissue. Biochemical analysis of identified peptides shows that TD facilitates identification of hydrophobic peptides.

2024-10-26

The UCSC Genome Browser database: 2025 update

Perez G, Barber GP, Benet-Pages A, Casper J, Clawson H, Diekhans M, Fischer C, Gonzalez JN, Hinrichs AS, Lee CM, Nassar LR, Raney BJ, Speir ML, van Baren MJ, Vaske CJ, Haussler D, Kent WJ, Haeussler M

HIVE TC-CMU

The UCSC Genome Browser (https://genome.ucsc.edu) is a widely utilized web-based tool for visualization and analysis of genomic data, encompassing over 4000 assemblies from diverse organisms. Since its release in 2001, it has become an essential resource for genomics and bioinformatics research. Annotation data available on Genome Browser includes both internally created and maintained tracks as well as custom tracks and track hubs provided by the research community. This last year's updates include over 25 new annotation tracks such as the gnomAD 4.1 track on the human GRCh38/hg38 assembly, the addition of three new public hubs, and significant expansions to the Genome Archive (GenArk) system for interacting with the enormous variety of assemblies. We have also made improvements to our interface, including updates to the browser graphic page, such as a new popup dialog feature that now displays item details without requiring navigation away from the main Genome Browser page. GenePred tracks have been upgraded with right-click options for zooming and precise navigation, along with enhanced mouseOver functions. Additional improvements include a new grouping feature for track hubs and hub description info links. A new tutorial focusing on Clinical Genetics has also been added to the UCSC Genome Browser.

2024-10-30

Consensus tissue domain detection in spatial omics data using multiplex image labeling with regional morphology (MILWRM)

Kaur H, Heiser CN, McKinley ET, Ventura-Antunes L, Harris CR, Roland JT, Farrow MA, Selden HJ, Pingry EL, Moore JF, Ehrlich LIR, Shrubsole MJ, Spraggins JM, Coffey RJ, Lau KS, Vandekar SN

TMC-Vanderbilt (Kidney)

Spatially resolved molecular assays provide high dimensional genetic, transcriptomic, proteomic, and epigenetic information in situ and at various resolutions. Pairing these data across modalities with histological features enables powerful studies of tissue pathology in the context of an intact microenvironment and tissue structure. Increasing dimensions across molecular analytes and samples require new data science approaches to functionally annotate spatially resolved molecular data. A specific challenge is data-driven cross-sample domain detection that allows for analysis within and between consensus tissue compartments across high volumes of multiplex datasets stemming from tissue atlasing efforts. Here, we present MILWRM (multiplex image labeling with regional morphology), a Python package for rapid, multi-scale tissue domain detection and annotation at the image- or spot-level. We demonstrate MILWRM's utility in identifying histologically distinct compartments in human colonic polyps, lymph nodes, mouse kidney, and mouse brain slices through spatially-informed clustering in two different spatial data modalities from different platforms. We used tissue domains detected in human colonic polyps to elucidate the molecular distinction between polyp subtypes, and explored the ability of MILWRM to identify anatomical regions of the brain tissue and their respective distinct molecular profiles.

2024-11-01

Mapping the Anatomy of the Human Lymphatic System

Bustos VP, Wang R, Pardo J, Boppana A, Weber G, Itkin M, Singhal D

TMC-BIDMC

Background: While substantial anatomical study has been pursued throughout the human body, anatomical study of the human lymphatic system remains in its infancy. For microsurgeons specializing in lymphatic surgery, a better command of lymphatic anatomy is needed to further our ability to offer surgical interventions with precision. In an effort to facilitate the dissemination and advancement of human lymphatic anatomy knowledge, our teams worked together to create a map. The aim of this paper is to present our experience in mapping the anatomy of the human lymphatic system. Methods: Three steps were followed to develop a modern map of the human lymphatic system: (1) identifying our source material, which was "Anatomy of the human lymphatic system," published by Rouvière and Tobias (1938), (2) choosing a modern platform, the Miro Mind Map software, to integrate the source material, and (3) transitioning our modern platform into The Human BioMolecular Atlas Program (HuBMAP). Results: The map of lymphatic anatomy based on the Rouvière textbook contained over 900 data points. Specifically, the map contained 404 channels, pathways, or trunks and 309 lymph node groups. Additionally, lymphatic drainage from 165 distinct anatomical regions was identified and integrated into the map. The map is being integrated into HuBMAP by creating a standard data format called an Anatomical Structures, Cell Types, plus Biomarkers table for the lymphatic vasculature, which is currently in the process of construction. Conclusion: Through a collaborative effort, we have developed a unified and centralized source for lymphatic anatomy knowledge available to the entire scientific community. We believe this resource will ultimately advance our knowledge of human lymphatic anatomy while simultaneously highlighting gaps for future research. Advancements in lymphatic anatomy knowledge will be critical for lymphatic surgeons to further refine surgical indications and operative approaches.

2024-11-01

Mechanisms of assembly and remodelling of the extracellular matrix

Naba A

DP-Illinois

The extracellular matrix (ECM) is the complex meshwork of proteins and glycans that forms the scaffold that surrounds and supports cells. It exerts key roles in all aspects of metazoan physiology, from conferring physical and mechanical properties on tissues and organs to modulating cellular processes such as proliferation, differentiation and migration. Understanding the mechanisms that orchestrate the assembly of the ECM scaffold is thus crucial to understand ECM functions in health and disease. This Review discusses novel insights into the compositional diversity of matrisome components and the mechanisms that lead to tissue-specific assemblies and architectures tailored to support specific functions. The Review then highlights recently discovered mechanisms, including post-translational modifications and metabolic pathways such as amino acid availability and the circadian clock, that modulate ECM secretion, assembly and remodelling in homeostasis and human diseases. Last, the Review explores the potential of 'matritherapies', that is, strategies to normalize ECM composition and architecture to achieve a therapeutic benefit.

2024-11-12

Personalized pangenome references

Sirén J, Eskandar P, Ungaro MT, Hickey G, Eizenga JM, Novak AM, Chang X, Chang PC, Kolmogorov M, Carroll A, Monlong J, Paten B

HIVE TC-CMU

Pangenomes reduce reference bias by representing genetic diversity better than a single reference sequence. Yet when comparing a sample to a pangenome, variants in the pangenome that are not part of the sample can be misleading, for example, causing false read mappings. These irrelevant variants are generally rarer in terms of allele frequency, and have previously been dealt with by filtering rare variants. However, this blunt heuristic both fails to remove some irrelevant variants and removes many relevant variants. We propose a new approach that imputes a personalized pangenome subgraph by sampling local haplotypes according to k-mer counts in the reads. We implement the approach in the vg toolkit ( https://github.com/vgteam/vg ) for the Giraffe short-read aligner and compare its accuracy to state-of-the-art methods using human pangenome graphs from the Human Pangenome Reference Consortium. This reduces small variant genotyping errors by four times relative to the Genome Analysis Toolkit and makes short-read structural variant genotyping of known variants competitive with long-read variant discovery methods.

2024-11-14

Spatially exploring RNA biology in archival formalin-fixed paraffin-embedded tissues

Bai Z, Zhang D, Gao Y, Tao B, Zhang D, Bao S, Enninful A, Wang Y, Li H, Su G, Tian X, Zhang N, Xiao Y, Liu Y, Gerstein M, Li M, Xing Y, Lu J, Xu ML, Fan R

TTD-Yale

The capability to spatially explore RNA biology in formalin-fixed paraffin-embedded (FFPE) tissues holds transformative potential for histopathology research. Here, we present pathology-compatible deterministic barcoding in tissue (Patho-DBiT) by combining in situ polyadenylation and computational innovation for spatial whole transcriptome sequencing, tailored to probe the diverse RNA species in clinically archived FFPE samples. It permits spatial co-profiling of gene expression and RNA processing, unveiling region-specific splicing isoforms, and high-sensitivity transcriptomic mapping of clinical tumor FFPE tissues stored for 5 years. Furthermore, genome-wide single-nucleotide RNA variants can be captured to distinguish malignant subclones from non-malignant cells in human lymphomas. Patho-DBiT also maps microRNA regulatory networks and RNA splicing dynamics, decoding their roles in spatial tumorigenesis. Single-cell level Patho-DBiT dissects the spatiotemporal cellular dynamics driving tumor clonal architecture and progression. Patho-DBiT stands poised as a valuable platform to unravel rich RNA biology in FFPE tissues to aid in clinical pathology evaluation.

2024-11-15

Proteomic insights into the extracellular matrix: a focus on proteoforms and their implications in health and disease

Bains AK, Naba A

DP-Illinois

Introduction: The extracellular matrix (ECM) is a highly organized and dynamic network of proteins and glycosaminoglycans that provides critical structural, mechanical, and biochemical support to cells. The functions of the ECM are directly influenced by the conformation of the proteins that compose it. ECM proteoforms, which can result from genetic, transcriptional, and/or post-translational modifications, adopt different conformations and, consequently, confer different structural properties and functionalities to the ECM in both physiological and pathological contexts. Areas covered: In this review, we discuss how bottom-up proteomics has been applied to identify, map, and quantify post-translational modifications (e.g. additions of chemical groups, proteolytic cleavage, or cross-links) and ECM proteoforms arising from alternative splicing or genetic variants. We further illustrate how proteoform-level information can be leveraged to gain novel insights into ECM protein structure and ECM functions in health and disease. Expert opinion: In the Expert opinion section, we discuss remaining challenges and opportunities with an emphasis on the importance of devising experimental and computational methods tailored to account for the unique biochemical properties of ECM proteins with the goal of increasing sequence coverage and, hence, accurate ECM proteoform identification.

Keywords: Alternative splicing; degradomics; isoforms; mass spectrometry; matrisome; post-translational modifications; sequence coverage; single amino acid variants.

2024-11-18

MatrixDB 2024: an increased coverage of extracellular matrix interactions, a new Network Explorer and a new web interface

Samarasinghe KW, Kotlyar M, Vallet SD, Hayes C, Naba A, Jurisica I, Lisacek F, Ricard-Blum S

DP-Illinois

MatrixDB, a member of the International Molecular Exchange consortium (IMEx), is a curated interaction database focused on interactions established by extracellular matrix (ECM) constituents including proteins, proteoglycans, glycosaminoglycans and ECM bioactive fragments. The architecture of MatrixDB was upgraded to ease interaction data export, allow versioning and programmatic access and ensure sustainability. The new version of the database includes more than twice the number of manually curated and experimentally-supported interactions. High-confidence predicted interactions were imported from the Integrated Interactions Database to increase the coverage of the ECM interactome. ECM and ECM-associated proteins of five species (human, murine, bovine, avian and zebrafish) were annotated with matrisome divisions and categories, which are used for computational analyses of ECM -omic datasets. Biological pathways from the Reactome Pathway Knowledgebase were also added to the biomolecule description. New transcriptomic and expanded proteomic datasets were imported in MatrixDB to generate cell- and tissue-specific ECM networks using the newly developed in-house Network Explorer integrated in the database. MatrixDB is freely available at https://matrixdb.univ-lyon1.fr.

2024-11-19

Prediction of single-cell RNA expression profiles in live cells by Raman microscopy with Raman2RNA

Kobayashi-Kirschvink KJ, Comiter CS, Gaddam S, Joren T, Grody EI, Ounadjela JR, Zhang K, Ge B, Kang JW, Xavier RJ, So PTC, Biancalani T, Shu J, Regev A

RTI-Broad

Single-cell RNA sequencing and other profiling assays have helped interrogate cells at unprecedented resolution and scale, but are inherently destructive. Raman microscopy reports on the vibrational energy levels of proteins and metabolites in a label-free and nondestructive manner at subcellular spatial resolution, but it lacks genetic and molecular interpretability. Here we present Raman2RNA (R2R), a method to infer single-cell expression profiles in live cells through label-free hyperspectral Raman microscopy images and domain translation. We predict single-cell RNA sequencing profiles nondestructively from Raman images using either anchor-based integration with single molecule fluorescence in situ hybridization, or anchor-free generation with adversarial autoencoders. R2R outperformed inference from brightfield images (cosine similarities: R2R >0.85 and brightfield <0.15). In reprogramming of mouse fibroblasts into induced pluripotent stem cells, R2R inferred the expression profiles of various cell states. With live-cell tracking of mouse embryonic stem cell differentiation, R2R traced the early emergence of lineage divergence and differentiation trajectories, overcoming discontinuities in expression space. R2R lays a foundation for future exploration of live genomic dynamics.

2024-11-22

CelloType: a unified model for segmentation and classification of tissue images

Pang M, Roy TK, Wu X, Tan K

TMC-CHOP

Cell segmentation and classification are critical tasks in spatial omics data analysis. Here we introduce CelloType, an end-to-end model designed for cell segmentation and classification for image-based spatial omics data. Unlike the traditional two-stage approach of segmentation followed by classification, CelloType adopts a multitask learning strategy that integrates these tasks, simultaneously enhancing the performance of both. CelloType leverages transformer-based deep learning techniques for improved accuracy in object detection, segmentation and classification. It outperforms existing segmentation methods on a variety of multiplexed fluorescence and spatial transcriptomic images. In terms of cell type classification, CelloType surpasses a model composed of state-of-the-art methods for individual tasks and a high-performance instance segmentation model. Using multiplexed tissue images, we further demonstrate the utility of CelloType for multiscale segmentation and classification of both cellular and noncellular elements in a tissue. The enhanced accuracy and multitask learning ability of CelloType facilitate automated annotation of rapidly growing spatial omics data.

2024-11-25

A multiomic atlas identifies a treatment-resistant, bone marrow progenitor-like cell population in T cell acute lymphoblastic leukemia

Xu J, Chen C, Sussman JH, Yoshimura S, Vincent T, Pölönen P, Hu J, Bandyopadhyay S, Elghawy O, Yu W, Tumulty J, Chen CH, Li EY, Diorio C, Shraim R, Newman H, Uppuluri L, Li A, Chen GM, Wu DW, Ding YY, Xu JA, Karanfilovski D, Lim T, Hsu M, Thadi A, Ahn KJ, Wu CY, Peng J, Sun Y, Wang A, Mehta R, Frank D, Meyer L, Loh ML, Raetz EA, Chen Z, Wood BL, Devidas M, Dunsmore KP, Winter SS, Chang TC, Wu G, Pounds SB, Zhang NR, Carroll W, Hunger SP, Bernt K, Yang JJ, Mullighan CG, Tan K, Teachey DT

TMC-CHOP

Refractoriness to initial chemotherapy and relapse after remission are the main obstacles to curing T cell acute lymphoblastic leukemia (T-ALL). While tumor heterogeneity has been implicated in treatment failure, the cellular and genetic factors contributing to resistance and relapse remain unknown. Here we linked tumor subpopulations with clinical outcome, created an atlas of healthy pediatric hematopoiesis and applied single-cell multiomic analysis to a diverse cohort of 40 T-ALL cases. We identified a bone marrow progenitor (BMP)-like leukemia subpopulation associated with treatment failure and poor overall survival. The single-cell-derived molecular signature of BMP-like blasts predicted poor outcome across multiple subtypes of T-ALL and revealed that NOTCH1 mutations additively drive T-ALL blasts away from the BMP-like state. Through in silico and in vitro drug screenings, we identified a therapeutic vulnerability of BMP-like blasts to apoptosis-inducing agents including venetoclax. Collectively, our study establishes multiomic signatures for rapid risk stratification and targeted treatment of high-risk T-ALL.

2024-12-04

A Streamlined Workflow for Microscopy-Driven MALDI Imaging Mass Spectrometry Data Collection

Esselman AB, Ward MS, Marshall CR, Pingry EL, Dufresne M, Farrow MA, Schrag M, Spraggins JM

TMC-Vanderbilt (Kidney)

Matrix-assisted laser desorption/ionization imaging mass spectrometry (MALDI IMS) is a rapidly advancing technology for biomedical research. As spatial resolution increases, however, so do acquisition time, file size, and experimental cost, which increases the need to perform precise sampling of targeted tissue regions to optimize the biological information gleaned from an experiment and minimize wasted resources. The ability to define instrument measurement regions based on key tissue features and automatically sample these specific regions of interest (ROIs) addresses this challenge. Herein, we demonstrate a workflow using standard software that allows for direct sampling of microscopy-defined regions by MALDI IMS. Three case studies are included, highlighting different methods for defining features from common sample types: manual annotation of vasculature in human brain tissue, automated segmentation of renal functional tissue units across whole slide images using custom segmentation algorithms, and automated segmentation of dispersed HeLa cells using open-source software. Each case minimizes data acquisition from unnecessary sample regions and dramatically increases throughput while uncovering molecular heterogeneity within targeted ROIs. This workflow provides an approachable method for spatially targeted MALDI IMS driven by microscopy as part of multimodal molecular imaging studies.

2024-12-05

Parallel measurement of transcriptomes and proteomes from same single cells using nanodroplet splitting

Fulcher JM, Markillie LM, Mitchell HD, Williams SM, Engbrecht KM, Degnan DJ, Bramer LM, Moore RJ, Chrisler WB, Cantlon-Bruce J, Bagnoli JW, Qian WJ, Seth A, Paša-Tolić L, Zhu Y

TTD-PNNL

Single-cell multiomics provides comprehensive insights into gene regulatory networks, cellular diversity, and temporal dynamics. Here, we introduce nanoSPLITS (nanodroplet SPlitting for Linked-multimodal Investigations of Trace Samples), an integrated platform that enables global profiling of the transcriptome and proteome from the same single cells via RNA sequencing and mass spectrometry-based proteomics, respectively. Benchmarking of nanoSPLITS demonstrates high measurement precision with deep proteomic and transcriptomic profiling of single cells. We apply nanoSPLITS to cyclin-dependent kinase 1 inhibited cells and find that phospho-signaling events can be quantified alongside global protein and mRNA measurements, providing insights into cell cycle regulation. We extend nanoSPLITS to primary cells isolated from human pancreatic islets, introducing an efficient approach for facile identification of unknown cell types and their protein markers by mapping transcriptomic data to existing large-scale single-cell RNA sequencing reference databases. Accordingly, we establish nanoSPLITS as a multiomic technology incorporating global proteomics and anticipate the approach will be critical to furthering our understanding of biological systems.

2024-12-13

Integrative spatial protein profiling with multi-omics

Fan R

TTD-Yale

2024-12-17

Exploring new frontiers in type 1 diabetes through advanced mass-spectrometry-based molecular measurements

Sarkar S, Zheng X, Clair GC, Kwon YM, You Y, Swensen AC, Webb-Robertson BM, Nakayasu ES, Qian WJ, Metz TO

TMC-PNNL

Type 1 diabetes (T1D) is a devastating autoimmune disease for which advanced mass spectrometry (MS) methods are increasingly used to identify new biomarkers and better understand underlying mechanisms. For example, integration of MS analysis and machine learning has identified multimolecular biomarker panels. In mechanistic studies, MS has contributed to the discovery of neoepitopes and of pathways involved in disease development, and to the identification of therapeutic targets. However, challenges remain in understanding the role of tissue microenvironments, spatial heterogeneity, and environmental factors in disease pathogenesis. Recent advancements in MS, such as ultra-fast ion-mobility separations, and single-cell and spatial omics, can play a central role in addressing these challenges. Here, we review recent advancements in MS-based molecular measurements and their role in understanding T1D.