Workshop Information

Topics

The workshop combines short instructional videos, special guest lectures, hands-on analyses of real data, and live demonstration sessions. It aims to teach practical bioinformatics skills and to build familiarity and basic competency with them. Using established tools and publicly available resources, we will focus on the analysis and interpretation of genomic and genetic data, which makes the workshop well suited to researchers with limited experience in big-data analysis.

Session 0: Introduction to R programming

This session introduces the basics of the R programming language and the use of Bioconductor packages. Through hands-on exercises, participants will learn basic syntax, data structures, visualization, and package configuration in the R environment.

Reference: Dr. Sean Davis's R BiocBook

Session 1: Microbiome Analysis

This session focuses on a case study of microbiome data analysis, using the latest version of R with microbiome-focused packages from Bioconductor and CRAN. Participants will learn how to import data from curated datasets and will become familiar with the specialized data structures used to store microbiome data in R.

Data: curated microbiome data from the Bioconductor packages curatedMetagenomicData and microbiomeDataSets.

Session 2: Metabolomics Analysis

This session will introduce several case studies featuring both targeted and untargeted metabolomics data. Through lectures on key concepts and analysis workflows, participants will learn data preprocessing and quality control using real LC-MS and NMR datasets. Additionally, they will explore case-control metabolomics studies through the application of statistical methods.

Data used for this session are from the public metabolomics data repository Metabolomics Workbench: Untargeted urine LC-HRMS metabolomics profiling for bladder cancer binary outcome classification

Session 3: Transcriptomic Analyses

This session covers a complete analysis workflow for bulk and single-cell RNA-seq, along with an introduction to spatial transcriptomics. Topics include experimental design, quality control, read mapping, differential expression analysis, and pathway and enrichment analyses. The single-cell RNA-seq part will also cover methods for unsupervised clustering and for detecting subpopulations of cells.

Participants will work on example datasets in a high-performance computing (HPC) environment on the Anvil supercomputer, combining Unix command-line (bash) tools and R packages in a Jupyter notebook interface. Key command-line tools for the hands-on analysis in this session include the SRA Toolkit and the STAR aligner; key R packages include edgeR and Seurat.

Example datasets used for this session are from the GEO database and are part of the studies by Yun et al., Oncotarget, 2017 and Vickman et al., The Prostate, 2019.

Session 4: Epigenomic Analyses

This session covers the analysis of epigenomic data from ChIP-seq and ATAC-seq experiments. Participants will analyze example epigenomic datasets, using command-line tools for preprocessing and peak or accessibility calling, followed by R/Bioconductor packages in Jupyter to integrate signal tracks, perform quality checks, and relate epigenomic signatures to functional genomic regions. Example tools used in this session include command-line tools such as Bowtie2 and R packages such as GenomicAlignments, DiffBind, TFBSTools, and rGADEM.

The example dataset used in this session is part of the study by Alvarez-Benayas et al., Nat Commun, 2021.

Session 5: Association, Causality and Gene Regulation

In this session, we will introduce the basic concepts and general ideas behind gene association studies and the construction of gene regulatory networks (GRNs). We will focus on genomic studies that integrate transcriptomic and genomic data. Using a case study in cancer research, we will work through cis-eQTL analysis and a state-of-the-art parallel algorithm, 2SPLS, for constructing genome-wide GRNs. In addition, lectures will cover how to explore the results with popular bioinformatics tools, including STRING and Ingenuity Pathway Analysis. The software used in this session is SIGNET.

Workshop Schedule

(We will post the schedule for summer 2026 once it is available. Please check back later.)

Previous Schedules
