Workshop-AnaGMendez-March-2024 – HuBMAP Consortium

Data Exploration of HuBMAP data on Bridges-2: HuBMAP’s scRNA-seq Data Analysis, Venn diagrams, and Gene Ontology Enrichment Analysis (GOEA)

March 20 - 21, 2024

Ana G. Mendez University

Universidad Ana G. Mendez

This comprehensive, two-day workshop was hosted at the Universidad Ana G. Mendez and offered to members of the PR-INBRE Bioinformatics Community of Practice (BiCoP). It was designed for professionals and enthusiasts seeking to deepen their understanding of data exploration and analysis with Python, single-cell RNA sequencing (scRNA-seq) analysis of HuBMAP data, Venn diagrams, and Gene Ontology Enrichment Analysis (GOEA).

Agenda: Day 1

Introduction to the resources used for this workshop

What is ACCESS? What is Bridges-2? What is HuBMAP?

Using OnDemand on Bridges-2

HuBMAP Hands-on Data Exploration Exercises

Connecting to Bridges-2 OnDemand
Uploading Jupyter Notebooks
scRNA-seq data analysis
Super Venn diagrams and classical Venn diagrams

Hands On Exercises – scRNA-seq & Venn Diagrams
This section will include exercises on scRNA-seq and Venn diagrams.

scRNA-seq

Import a Jupyter notebook on Bridges-2 OnDemand to perform the scRNA-seq data analysis
Install and load the required Python libraries to conduct the analysis
Read HuBMAP's data file for HBM538.PHSC.677
Perform Principal Component Analysis (PCA) to reduce the dimensionality of the data
Inspect the contribution of single PCs to the total variance in the data through an Elbow plot
Compute and cluster the neighborhood graph
Find marker genes per cluster using the t-test, the Wilcoxon rank-sum (Mann-Whitney U) test, and logistic regression

See the scRNA-seq hands-on training module.
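
As a rough illustration of the PCA and elbow-plot steps above, the sketch below computes the fraction of total variance explained by each principal component. It assumes plain NumPy and a synthetic cells-by-genes matrix rather than the HBM538.PHSC.677 data file, and the function name `pca_variance_ratio` is hypothetical; the training module itself may use a dedicated single-cell library instead.

```python
import numpy as np


def pca_variance_ratio(matrix, n_comps=10):
    """Fraction of total variance explained by the first n_comps principal components.

    matrix: 2-D array with cells as rows and genes as columns.
    """
    # Center every gene (column) so PCA describes variance around the mean.
    centered = matrix - matrix.mean(axis=0)
    # Singular values of the centered matrix give the per-component variance.
    singular_values = np.linalg.svd(centered, compute_uv=False)
    variances = singular_values ** 2 / (matrix.shape[0] - 1)
    return (variances / variances.sum())[:n_comps]


# Synthetic stand-in data (an assumption, not HuBMAP data):
# 200 cells x 50 genes, two groups of cells separated along one direction.
rng = np.random.default_rng(0)
group = np.repeat([0.0, 1.0], 100)
signal = 5 * np.outer(group, rng.normal(size=50))
matrix = signal + rng.normal(size=(200, 50))

ratios = pca_variance_ratio(matrix)
```

Plotting `ratios` against the component index gives the elbow plot; the point where the curve flattens is a common choice for how many components to keep when computing the neighborhood graph.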

Venn Diagrams

Define the lists of marker genes for the clusters from a HuBMAP scRNA-seq data analysis
Compare the lists of differentially expressed genes among all the clusters by means of a Super Venn diagram
Compare the list of differentially expressed genes from two clusters that share genes in common by means of a classical Venn diagram

See the Venn diagram hands-on training module.
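
The comparisons that a Venn diagram visualizes are set operations on the marker-gene lists. A minimal sketch using only Python's built-in sets follows; the gene names and cluster labels are placeholders rather than results from any HuBMAP dataset, and the helper names `shared_genes` and `venn_regions` are hypothetical. Drawing the diagrams themselves is left to the plotting libraries used in the training module.

```python
from itertools import combinations

# Placeholder marker lists; these are NOT results from a HuBMAP analysis.
marker_genes = {
    "cluster_0": {"GENE_A", "GENE_B", "GENE_C", "GENE_D"},
    "cluster_1": {"GENE_C", "GENE_D", "GENE_E"},
    "cluster_2": {"GENE_F", "GENE_G"},
}


def shared_genes(markers):
    """Return the genes shared by each pair of clusters (pairs with none are omitted)."""
    shared = {}
    for a, b in combinations(sorted(markers), 2):
        common = markers[a] & markers[b]
        if common:
            shared[(a, b)] = common
    return shared


def venn_regions(set_a, set_b):
    """Split two gene sets into the three regions of a classical two-set Venn diagram."""
    return {
        "only_a": set_a - set_b,
        "both": set_a & set_b,
        "only_b": set_b - set_a,
    }


pairs = shared_genes(marker_genes)
regions = venn_regions(marker_genes["cluster_0"], marker_genes["cluster_1"])
```

Scanning `pairs` is one way to pick the two clusters worth comparing in a classical Venn diagram, since only clusters with genes in common appear there.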

Agenda: Day 2

HuBMAP APIs and other services

NumPy vs. Dask (and maybe CuPy)
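
To give a sense of what such a comparison can look like, the sketch below computes the same column means eagerly with NumPy and lazily, in row chunks, with Dask. The array, chunk size, and function names are assumptions chosen for illustration, not material from the session; the Dask path is skipped when Dask is not installed, and CuPy is not shown.

```python
import numpy as np


def column_means_numpy(matrix):
    """Eager computation: the whole array is processed in memory at once."""
    return matrix.mean(axis=0)


def column_means_dask(matrix, chunk_rows=250):
    """Chunked computation: Dask builds a task graph and evaluates it on compute()."""
    import dask.array as da  # imported lazily so the NumPy path works without Dask

    lazy = da.from_array(matrix, chunks=(chunk_rows, matrix.shape[1]))
    return lazy.mean(axis=0).compute()


rng = np.random.default_rng(42)
data = rng.random((1000, 8))

numpy_means = column_means_numpy(data)
try:
    dask_means = column_means_dask(data)
except ImportError:
    dask_means = None  # Dask is optional in this sketch
```

For an array this small NumPy alone is the simpler choice; chunking pays off mainly when the data does not fit in memory or the work can be spread across cores.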

Hands On Exercises – STRING DB & GOEA
This section will include exercises on using the STRING database and Gene Ontology Enrichment Analysis.

STRING DB

Import a Jupyter notebook on Bridges-2 OnDemand to perform the scRNA-seq data analysis
Install and load the required Python libraries to conduct the analysis
Define the lists of marker genes for the clusters from a HuBMAP scRNA-seq data analysis
Map the genes of interest to the STRING database IDs
Generate the STRING database interaction network image for the dataset and for individual genes
Perform the functional enrichment analysis
Select the top 10 results for Molecular Function, Biological Process & Cellular Component
Generate the bar plot with the significant results from the functional enrichment analysis for Molecular Function, Biological Process, and Cellular Component

See the STRING database hands-on training module.
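
The ID-mapping and top-10 selection steps above can be sketched as follows. The code only builds a request URL and does not contact the server. It assumes the public STRING REST API layout (`https://string-db.org/api/<format>/<method>`) with the `identifiers` and `species` parameters, and it assumes enrichment rows carry `category` and `fdr` fields; check all of these against the current STRING API documentation. The gene names, enrichment rows, and helper names are placeholders.

```python
from urllib.parse import urlencode

STRING_API_BASE = "https://string-db.org/api"  # assumed base URL of the STRING REST API


def build_string_request(method, genes, species=9606, output_format="tsv"):
    """Build (but do not send) a STRING API request URL.

    species=9606 is the NCBI taxonomy ID for human.
    """
    params = urlencode({
        "identifiers": "\r".join(genes),  # identifiers are separated by carriage returns
        "species": species,
    })
    return f"{STRING_API_BASE}/{output_format}/{method}?{params}"


def top_terms(rows, category, n=10):
    """Return the n rows of one category with the smallest false discovery rate."""
    matching = [row for row in rows if row["category"] == category]
    return sorted(matching, key=lambda row: row["fdr"])[:n]


# Placeholder genes and enrichment rows, not real STRING output.
id_url = build_string_request("get_string_ids", ["GENE_A", "GENE_B"])
rows = [
    {"category": "Process", "description": "term 1", "fdr": 0.04},
    {"category": "Process", "description": "term 2", "fdr": 0.001},
    {"category": "Function", "description": "term 3", "fdr": 0.02},
]
best_process = top_terms(rows, "Process")
```

Calling `top_terms` once per category (Molecular Function, Biological Process, Cellular Component) yields the three lists that feed the bar plot.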

Gene Ontology Enrichment Analysis (GOEA)

Import a Jupyter notebook on Bridges-2 OnDemand to perform the scRNA-seq data analysis
Install and load the required Python libraries to conduct the analysis
Define the lists of marker genes for the clusters from a HuBMAP scRNA-seq data analysis
Obtain and load the background gene set from NCBI
Perform the Gene Ontology Enrichment Analysis (GOEA) and generate the bar plot with the significant results from the GOEA

See the GOEA hands-on training module.
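
At its core, a GO enrichment test asks whether a GO term is annotated to more marker genes than chance would predict, given the background gene set. The sketch below computes that as a one-sided hypergeometric p-value, which is equivalent to a one-sided Fisher's exact test. It uses only the standard library; the function name and the counts are illustrative assumptions, and the multiple-testing correction a full GOEA would apply is not shown.

```python
from math import comb


def enrichment_p_value(study_hits, study_size, background_hits, background_size):
    """One-sided hypergeometric p-value for over-representation.

    Probability of seeing at least `study_hits` annotated genes when drawing
    `study_size` genes from a background of `background_size` genes, of which
    `background_hits` carry the GO term.
    """
    total = comb(background_size, study_size)
    upper = min(study_size, background_hits)
    tail = sum(
        comb(background_hits, k)
        * comb(background_size - background_hits, study_size - k)
        for k in range(study_hits, upper + 1)
    )
    return tail / total


# Illustrative counts only: 8 of 20 marker genes carry a term that
# annotates 100 of 20000 background genes.
p_value = enrichment_p_value(8, 20, 100, 20000)
```

A small `p_value` suggests the term is over-represented among the marker genes; in practice the p-values for all tested terms are corrected for multiple testing before the significant ones are plotted.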

Questions? Contact us at help@hubmapconsortium.org