Workshops

Data Exploration of HuBMAP data on Bridges-2: HuBMAP’s scRNA-seq Data Analysis, Venn diagrams, and Gene Ontology Enrichment Analysis (GOEA)

March 20 - 21, 2024
Ana G. Mendez University

This comprehensive two-day workshop was hosted at the Universidad Ana G. Mendez and offered to members of the PR-INBRE Bioinformatics Community of Practice (BiCoP). It was designed for professionals and enthusiasts seeking to deepen their understanding of data exploration and analysis with Python, single-cell RNA sequencing (scRNA-seq) analysis of HuBMAP data, Venn diagrams, and Gene Ontology Enrichment Analysis (GOEA).

Agenda: Day 1

Introduction to the resources used for this workshop
What is ACCESS? What is Bridges-2? What is HuBMAP?

Using OnDemand on Bridges-2
HuBMAP Hands-on Data Exploration Exercises
Connecting to Bridges-2 OnDemand
Uploading Jupyter Notebooks
scRNA-seq data analysis
Super Venn diagrams and classical Venn diagrams

Hands-On Exercises – scRNA-seq & Venn Diagrams
This section will include exercises on scRNA-seq and Venn diagrams.

scRNA-seq

  • Import a Jupyter notebook on Bridges-2 OnDemand to perform the scRNA-seq data analysis
  • Install and load the required Python libraries to conduct the analysis
  • Read HuBMAP's data file for HBM538.PHSC.677
  • Perform Principal Component Analysis (PCA) to reduce the dimensionality of the data
  • Inspect the contribution of single PCs to the total variance in the data through an Elbow plot
  • Compute and cluster the neighborhood graph
  • Find marker genes per cluster using the t-test, the Wilcoxon rank-sum (Mann-Whitney U) test, and logistic regression

See the scRNA-seq hands-on training module.
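
As an illustration of the marker-gene step, the following is a minimal from-scratch sketch of the Wilcoxon rank-sum (Mann-Whitney U) test for a single gene, comparing expression in one cluster against all other cells. It is not the workshop notebook's code; the function names are hypothetical, and the p-value uses the normal approximation without tie or continuity correction, so it will differ slightly from what a statistics library reports.

```python
from __future__ import annotations

import math
from typing import Sequence


def average_ranks(values: Sequence[float]) -> list[float]:
    """Rank values from 1 to n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend the window over a run of tied values.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        mean_rank = (start + end) / 2 + 1  # ranks are 1-based
        for k in range(start, end + 1):
            ranks[order[k]] = mean_rank
        start = end + 1
    return ranks


def rank_sum_test(in_cluster: Sequence[float], rest: Sequence[float]) -> tuple[float, float]:
    """Return (U statistic for in_cluster, two-sided p-value).

    Assumption: normal approximation, no tie or continuity correction.
    """
    n1, n2 = len(in_cluster), len(rest)
    ranks = average_ranks(list(in_cluster) + list(rest))
    rank_sum = sum(ranks[:n1])
    u = rank_sum - n1 * (n1 + 1) / 2
    mean_u = n1 * n2 / 2
    sd_u = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
    z = (u - mean_u) / sd_u
    p_value = math.erfc(abs(z) / math.sqrt(2))  # two-sided tail of N(0, 1)
    return u, p_value
```

Calling `rank_sum_test(expr_in_cluster, expr_in_other_cells)` once per gene and sorting genes by p-value gives a ranked list of candidate markers for that cluster; in practice the p-values would also be corrected for multiple testing.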

 

Venn Diagrams

  • Define the lists of marker genes for the clusters from a HuBMAP scRNA-seq data analysis
  • Compare the lists of differentially expressed genes among all the clusters by means of a Super Venn diagram
  • Compare the lists of differentially expressed genes from two clusters that share genes by means of a classical Venn diagram

See the Venn diagram hands-on training module.
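
The comparisons behind both diagram types reduce to set operations, which can be sketched with plain Python sets. The snippet below is a minimal illustration only: the function names, cluster labels, and gene symbols are placeholders, not workshop results. The first function computes the three regions of a two-set Venn diagram; the second groups genes by the exact combination of clusters that contain them, which is the information a multi-set diagram displays.

```python
from __future__ import annotations

from collections import defaultdict
from typing import Iterable, Mapping


def venn_regions(genes_a: Iterable[str], genes_b: Iterable[str]) -> dict[str, set[str]]:
    """Return the three regions of a classical two-set Venn diagram."""
    a, b = set(genes_a), set(genes_b)
    return {"only_a": a - b, "shared": a & b, "only_b": b - a}


def group_by_membership(clusters: Mapping[str, Iterable[str]]) -> dict[frozenset[str], set[str]]:
    """Group genes by the exact set of clusters in which they appear."""
    membership: dict[str, set[str]] = defaultdict(set)
    for name, genes in clusters.items():
        for gene in genes:
            membership[gene].add(name)
    groups: dict[frozenset[str], set[str]] = defaultdict(set)
    for gene, names in membership.items():
        groups[frozenset(names)].add(gene)
    return dict(groups)


# Placeholder gene symbols for illustration only; not workshop results.
cluster_markers = {
    "cluster_0": ["GENE_A", "GENE_B", "GENE_C"],
    "cluster_1": ["GENE_B", "GENE_C", "GENE_D"],
    "cluster_2": ["GENE_C", "GENE_E"],
}

regions = venn_regions(cluster_markers["cluster_0"], cluster_markers["cluster_1"])
groups = group_by_membership(cluster_markers)
```

The sizes of the sets in `regions` are the counts a two-circle Venn diagram would show, and each key in `groups` corresponds to one distinct overlap region among all clusters.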

Agenda: Day 2

HuBMAP APIs and other services
NumPy vs. Dask (and possibly CuPy)

Hands-On Exercises – STRING DB & GOEA
This section will include exercises on using the STRING database and Gene Ontology Enrichment Analysis.

STRING DB

  • Import a Jupyter notebook on Bridges-2 OnDemand to perform the STRING database analysis
  • Install and load the required Python libraries to conduct the analysis
  • Define the lists of marker genes for the clusters from a HuBMAP scRNA-seq data analysis
  • Map the genes of interest to the STRING database IDs
  • Generate the STRING database interaction network image for the dataset and for individual genes
  • Perform the functional enrichment analysis
  • Select the top 10 results for Molecular Function, Biological Process, and Cellular Component
  • Generate a bar plot of the significant results from the functional enrichment analysis for Molecular Function, Biological Process, and Cellular Component

See the STRING database hands-on training module.
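
The mapping and enrichment steps can be approached through STRING's public REST API. The sketch below only builds request URLs and post-processes a tab-separated response; it makes no network calls and is not the workshop notebook's code. The helper names are hypothetical, and the column names (`category`, `fdr`) and category labels (such as `Process`, `Function`, `Component`) are assumptions that should be checked against the actual response.

```python
from __future__ import annotations

import csv
import io
from urllib.parse import urlencode

STRING_API = "https://string-db.org/api"


def string_url(method: str, genes: list[str], species: int = 9606, output: str = "tsv") -> str:
    """Build a STRING API request URL.

    Gene identifiers are joined with carriage returns; 9606 is the NCBI
    taxonomy ID for human.
    """
    params = {"identifiers": "\r".join(genes), "species": species}
    return f"{STRING_API}/{output}/{method}?{urlencode(params)}"


def parse_tsv(text: str) -> list[dict[str, str]]:
    """Parse a tab-separated response body into one dict per row."""
    return list(csv.DictReader(io.StringIO(text), delimiter="\t"))


def top_terms(rows: list[dict[str, str]], category: str, n: int = 10) -> list[dict[str, str]]:
    """Keep rows of one category and return the n rows with the lowest FDR.

    Assumes each row has 'category' and 'fdr' columns.
    """
    selected = [row for row in rows if row["category"] == category]
    return sorted(selected, key=lambda row: float(row["fdr"]))[:n]
```

For example, `string_url("get_string_ids", genes)` gives the ID-mapping request and `string_url("enrichment", genes)` the functional enrichment request; the response text, fetched with any HTTP client, can then be passed through `parse_tsv` and `top_terms` before plotting.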

 

Gene Ontology Enrichment Analysis (GOEA)

  • Import a Jupyter notebook on Bridges-2 OnDemand to perform the Gene Ontology Enrichment Analysis
  • Install and load the required Python libraries to conduct the analysis
  • Define the lists of marker genes for the clusters from a HuBMAP scRNA-seq data analysis
  • Obtain and load the background gene set from NCBI
  • Perform the Gene Ontology Enrichment Analysis (GOEA) and generate a bar plot of the significant results

See the GOEA hands-on training module.
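
To show what the enrichment test measures, here is a minimal sketch of the over-representation calculation for a single GO term, followed by a Benjamini-Hochberg correction across terms. It assumes a one-sided hypergeometric test, a common choice for GOEA that is equivalent to a one-sided Fisher's exact test; the function names are hypothetical and this is not the workshop notebook's code.

```python
from __future__ import annotations

import math


def enrichment_p_value(study_hits: int, study_size: int,
                       population_hits: int, population_size: int) -> float:
    """Probability of seeing at least study_hits annotated genes by chance.

    study_hits:      marker genes annotated with the GO term
    study_size:      marker genes in the cluster
    population_hits: background genes annotated with the GO term
    population_size: genes in the background set
    """
    upper = min(study_size, population_hits)
    total = math.comb(population_size, study_size)
    tail = sum(
        math.comb(population_hits, k)
        * math.comb(population_size - population_hits, study_size - k)
        for k in range(study_hits, upper + 1)
    )
    return tail / total


def benjamini_hochberg(p_values: list[float]) -> list[float]:
    """Return FDR-adjusted p-values in the same order as the input."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted
```

Terms whose adjusted p-value falls below a chosen threshold (0.05 is a common choice) are the "significant results" that would appear in the bar plot.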

 

Questions? Contact us at help@hubmapconsortium.org