Workshops
Data Exploration of HuBMAP data on Bridges-2: HuBMAP’s scRNA-seq Data Analysis, Venn diagrams, and Gene Ontology Enrichment Analysis (GOEA)
March 20 - 21, 2024
Ana G. Mendez University
This comprehensive two-day workshop was hosted at the Universidad Ana G. Mendez and offered to members of the PR-INBRE Bioinformatics Community of Practice (BiCoP). It was designed for professionals and enthusiasts seeking to deepen their understanding of data exploration and analysis with Python, single-cell RNA sequencing (scRNA-seq) analysis of HuBMAP data, Venn diagrams, and Gene Ontology Enrichment Analysis (GOEA).
Agenda: Day 1
Introduction to the resources used for this workshop
- What is ACCESS?
- What is Bridges-2?
- What is HuBMAP?
- Using OnDemand on Bridges-2
HuBMAP Hands-on Data Exploration Exercises
- Connecting to Bridges-2 OnDemand
- Uploading Jupyter Notebooks
- scRNA-seq data analysis
- Super Venn diagrams and classical Venn diagrams
Hands On Exercises – scRNA-seq & Venn Diagrams
This section will include exercises on scRNA-seq and Venn diagrams.
- Import a Jupyter notebook on Bridges-2 OnDemand to perform the scRNA-seq data analysis
- Install and load the required Python libraries to conduct the analysis
- Read HuBMAP's data file for HBM538.PHSC.677
- Perform Principal Component Analysis (PCA) to reduce the dimensionality of the data
- Inspect the contribution of individual principal components (PCs) to the total variance in the data through an elbow plot
- Compute and cluster the neighborhood graph
- Find marker genes per cluster using the t-test, the Wilcoxon rank-sum (Mann-Whitney U) test, and logistic regression
See the scRNA-seq hands-on training module.
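The PCA and elbow-plot steps listed above can be illustrated with a minimal sketch. It uses only NumPy on a synthetic matrix, not the HuBMAP data file or the workshop notebook; the function name `explained_variance_ratio` and the toy data are assumptions made for illustration. Plotting `ratios` against the component index gives the elbow plot.

```python
import numpy as np


def explained_variance_ratio(X):
    """Fraction of total variance captured by each principal component."""
    Xc = X - X.mean(axis=0)                   # center each gene (column)
    s = np.linalg.svd(Xc, compute_uv=False)   # singular values, largest first
    variances = s ** 2 / (X.shape[0] - 1)     # variance along each component
    return variances / variances.sum()


# Hypothetical toy matrix: 200 cells x 50 genes driven by two latent factors
rng = np.random.default_rng(0)
latent = rng.normal(size=(200, 2)) * [10.0, 5.0]
loadings = rng.normal(size=(2, 50))
X = latent @ loadings + rng.normal(size=(200, 50))

ratios = explained_variance_ratio(X)

# The "elbow" is where additional components stop adding much variance
for i, r in enumerate(ratios[:5], start=1):
    print(f"PC{i}: {r:.3f}")
```

In this toy example the first two components dominate because the data were built from two latent factors; with real scRNA-seq data the drop-off is usually more gradual.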
- Define the lists of marker genes for the clusters from a HuBMAP scRNA-seq data analysis
- Compare the lists of differentially expressed genes among all the clusters by means of a Super Venn diagram
- Compare the lists of differentially expressed genes from two clusters that share genes by means of a classical Venn diagram
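The comparisons that a Venn diagram displays are set operations on the marker gene lists. The sketch below computes them with plain Python sets; the cluster names, gene lists, and helper names (`pairwise_overlaps`, `venn_regions`) are hypothetical and are not taken from the workshop notebook.

```python
from itertools import combinations

# Hypothetical marker gene lists, one set per cluster
markers = {
    "cluster_0": {"CD3D", "CD3E", "IL7R", "LTB"},
    "cluster_1": {"CD79A", "MS4A1", "LTB"},
    "cluster_2": {"LYZ", "CD14", "S100A8"},
}


def pairwise_overlaps(gene_sets):
    """Genes shared by every pair of clusters (pairs in sorted name order)."""
    return {
        (a, b): gene_sets[a] & gene_sets[b]
        for a, b in combinations(sorted(gene_sets), 2)
    }


def venn_regions(a, b):
    """The three regions of a classical two-set Venn diagram."""
    return {"only_a": a - b, "shared": a & b, "only_b": b - a}


overlaps = pairwise_overlaps(markers)
regions = venn_regions(markers["cluster_0"], markers["cluster_1"])

for pair, genes in overlaps.items():
    print(pair, sorted(genes))
print({name: sorted(genes) for name, genes in regions.items()})
```

Checking the pairwise overlaps first is one way to pick the two clusters worth drawing as a classical Venn diagram.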
Agenda: Day 2
HuBMAP APIs and other services
NumPy vs. Dask (and maybe CuPy)
Hands On Exercises – STRING DB & GOEA
This section will include exercises on using the STRING database and Gene Ontology Enrichment Analysis.
- Import a Jupyter notebook on Bridges-2 OnDemand to perform the scRNA-seq data analysis
- Install and load the required Python libraries to conduct the analysis
- Define the lists of marker genes for the clusters from a HuBMAP scRNA-seq data analysis
- Map the genes of interest to the STRING database IDs
- Generate the STRING database interaction network image for the dataset and for individual genes
- Perform the functional enrichment analysis
- Select the top 10 results for Molecular Function, Biological Process & Cellular Component
- Generate the bar plot with the significant results from the functional enrichment analysis for Molecular Function, Biological Process, and Cellular Component
See the STRING database hands-on training module.
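Two of the steps above, mapping gene names to STRING IDs and selecting the top 10 results per category, can be sketched as follows. This is an illustration under stated assumptions, not the workshop notebook: the helper names are hypothetical, the enrichment rows are invented, and the row keys (`category`, `fdr`, `description`) and category labels are assumed to mirror the STRING enrichment output. The sketch only builds the request URL for the public STRING `get_string_ids` endpoint and does not contact the server.

```python
from urllib.parse import urlencode

STRING_API = "https://string-db.org/api"  # public STRING REST base URL


def string_ids_url(genes, species=9606, output_format="tsv"):
    """Build a get_string_ids request URL (9606 is the NCBI taxon ID for human)."""
    params = {
        "identifiers": "\r".join(genes),  # names separated by carriage returns
        "species": species,
    }
    return f"{STRING_API}/{output_format}/get_string_ids?{urlencode(params)}"


def top_terms(rows, category, n=10):
    """Return up to n rows of one category, smallest FDR first."""
    selected = [row for row in rows if row["category"] == category]
    return sorted(selected, key=lambda row: row["fdr"])[:n]


# Hypothetical enrichment rows (assumed structure, invented values)
rows = [
    {"category": "Process", "description": "immune response", "fdr": 1e-6},
    {"category": "Process", "description": "cell activation", "fdr": 1e-8},
    {"category": "Function", "description": "protein binding", "fdr": 1e-3},
    {"category": "Component", "description": "plasma membrane", "fdr": 1e-4},
]

url = string_ids_url(["TP53", "BRCA1"])
best_process = top_terms(rows, "Process")
print(url)
print([row["description"] for row in best_process])
```

The URL can then be fetched with any HTTP client, and the filtered rows passed to a plotting library to produce the bar plot.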
Gene Ontology Enrichment Analysis (GOEA)
- Import a Jupyter notebook on Bridges-2 OnDemand to perform the scRNA-seq data analysis
- Install and load the required Python libraries to conduct the analysis
- Define the lists of marker genes for the clusters from a HuBMAP scRNA-seq data analysis
- Obtain and load the background gene set from NCBI
- Perform the Gene Ontology Enrichment Analysis (GOEA) and generate the bar plot with the significant results
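The statistical idea behind a GOEA can be sketched with a one-sided hypergeometric test: given a background gene set, how surprising is the number of study genes annotated with a particular GO term? The function below is a minimal standard-library illustration with invented counts. It is an assumption that this is the kind of test applied; the workshop notebook may use a dedicated GOEA library and a multiple-testing correction instead.

```python
from math import comb


def enrichment_p_value(study_hits, study_size, background_hits, background_size):
    """P(X >= study_hits) when study_size genes are drawn from the background.

    background_hits is the number of background genes annotated with the term.
    """
    upper = min(study_size, background_hits)
    # Count the favourable draws with exact integers, then divide once
    favourable = sum(
        comb(background_hits, i)
        * comb(background_size - background_hits, study_size - i)
        for i in range(study_hits, upper + 1)
    )
    return favourable / comb(background_size, study_size)


# Hypothetical counts: 20 background genes, 5 annotated with the term,
# 5 marker genes in the study set, 3 of which carry the annotation
p = enrichment_p_value(study_hits=3, study_size=5,
                       background_hits=5, background_size=20)
print(f"p = {p:.4f}")
```

A small p-value suggests the term is over-represented among the marker genes relative to the background.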
Questions? Contact us at help@hubmapconsortium.org