Advanced bioinformatics and data analyses across genomics, transcriptomics, and population-level studies.
🖥️ R & Python Programming
Expert-level programming in R and Python for bioinformatics, statistical analysis, and pipeline development, with extensive use of specialized libraries and frameworks.
Key Methods & Techniques:
R (Bioconductor, tidyverse, Shiny)
Python (pandas, NumPy, scikit-learn, PyTorch, TensorFlow)
Workflow automation & reproducible research
Scripting for high-performance computing (HPC & cloud)
🧬 Exome & Whole-Genome Sequencing
Comprehensive analysis of coding and non-coding variants to identify clinically relevant findings.
Key Methods & Techniques:
Raw data QC, alignment & post-alignment QC
Somatic & germline SNV, indel & CNV calling
Variant QC & annotation
Variant interpretation (ACMG/AMP, AMP/ASCO/CAP)
HPO-based variant prioritization
Pedigree/segregation analysis, trio analysis, carrier screening
🌐 Genome-Wide Association Studies (GWAS)
Statistical analysis of genetic variants across the genome to identify loci associated with complex traits and diseases.
Key Methods & Techniques:
Data QC & preprocessing
Association testing (single variant & gene-based)
Covariate adjustment
Family-based GWAS methods
🧮 Polygenic Risk Scores (PRS)
Development and application of PRS models for predicting disease susceptibility and treatment response.
Key Methods & Techniques:
PRS calculation (PRSice, LDpred, PRS-CS)
Cross-ancestry PRS transferability
Validation in independent cohorts
Integration with clinical covariates
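For illustration, the scoring step behind a PRS reduces to a weighted sum of effect-allele dosages. The sketch below assumes the per-variant weights have already been estimated elsewhere (for example by one of the tools listed above); the function name and the toy numbers are hypothetical and not taken from any real cohort.

```python
def polygenic_score(dosages, weights):
    """Weighted sum of effect-allele dosages for one individual.

    dosages: effect-allele counts per variant (0, 1, 2, or imputed fractions)
    weights: per-variant effect sizes, in the same variant order
    """
    if len(dosages) != len(weights):
        raise ValueError("dosages and weights must cover the same variants")
    return sum(d * w for d, w in zip(dosages, weights))


# Toy example with made-up numbers: three variants, two individuals.
toy_weights = [0.10, -0.05, 0.20]
toy_cohort = {"sample_A": [0, 1, 2], "sample_B": [2, 0, 1]}
toy_scores = {name: polygenic_score(d, toy_weights) for name, d in toy_cohort.items()}
```

In practice the raw scores are usually standardized against a reference population before being combined with clinical covariates.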
📖 De novo Transcriptome Assembly & Annotation
Reference-free reconstruction of transcriptomes for organisms lacking a genome assembly.
Key Methods & Techniques:
De novo transcriptome assembly (Trinity, Oases)
Isoform discovery & quantification
Transcriptome completeness assessment (BUSCO)
Functional annotation (BLAST, InterProScan, GO, KEGG)
🧫 Single-Cell RNA Sequencing
High-resolution analysis of gene expression at the single-cell level.
Key Methods & Techniques:
Cell type clustering & annotation
Trajectory/pseudotime analysis
Differential expression in single cells
Batch correction & dataset integration
🧩 Bacterial GWAS
Genome-wide association studies in bacteria using diverse genetic features.
Key Methods & Techniques:
SNP-based bacterial GWAS
Indels, k-mers, unitigs, orthologous genes
Alignment-based & k-mer–based approaches
Population structure correction
🧪 Epigenomics & Chromatin Analysis
Profiling epigenetic modifications and chromatin accessibility.
Key Methods & Techniques:
ChIP-seq, CUT&RUN, CUT&Tag analysis
ATAC-seq for chromatin accessibility
DNA methylation (arrays & WGBS)
Differential methylation analysis
⚙️ Automated Pipeline Development
Design and implementation of reproducible and scalable pipelines for omics data analysis.
Key Methods & Techniques:
Nextflow, Snakemake, Cromwell/WDL
Containerization (Docker, Singularity)
Workflow deployment on HPC & cloud systems
Continuous integration & automated testing
💻 Scientific Computing
High-performance and cloud-based computing for large-scale data analysis.
Key Methods & Techniques:
Parallel & distributed computing (MPI, Dask, Spark)
HPC job scheduling (SLURM, PBS)
Cloud platforms (AWS, GCP, Azure)
Scalable storage & resource optimization
📊 Data Visualization
Visualization of genomic and biological data using modern plotting libraries and interactive tools.
Key Methods & Techniques:
ggplot2, plotly, seaborn, matplotlib
Interactive dashboards (Shiny, Dash, Streamlit)
Genome browser integration
Publication-ready visualizations
🔍 Rare Variant Burden Testing
Statistical methods to evaluate the cumulative effect of rare variants within genes or genomic regions on disease risk.
Key Methods & Techniques:
Burden tests, SKAT, SKAT-O
Collapsing methods
Family-based & population-based testing
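As a rough illustration of the collapsing idea listed above, the sketch below flags each individual as a carrier if they hold at least one rare allele in a gene, then compares carrier counts between cases and controls with a one-sided Fisher's exact test. The function names, the 1% frequency cutoff, and the toy genotypes are assumptions made for the example; variance-component tests such as SKAT and SKAT-O work differently and are not shown.

```python
from math import comb


def carrier_flags(genotypes, mafs, maf_cutoff=0.01):
    """True for each individual carrying at least one rare alternate allele.

    genotypes: one list of alternate-allele counts per individual
    mafs: minor allele frequency per variant, in the same variant order
    """
    rare = [i for i, freq in enumerate(mafs) if freq < maf_cutoff]
    return [any(person[i] > 0 for i in rare) for person in genotypes]


def fisher_one_sided(a, b, c, d):
    """P(case carriers >= a) under the hypergeometric null.

    a, b: carriers and non-carriers among cases
    c, d: carriers and non-carriers among controls
    """
    n_cases, n_carriers, n_total = a + b, a + c, a + b + c + d
    upper = min(n_cases, n_carriers)
    tail = sum(
        comb(n_cases, k) * comb(n_total - n_cases, n_carriers - k)
        for k in range(a, upper + 1)
    )
    return tail / comb(n_total, n_carriers)


def collapsing_test(genotypes, mafs, is_case, maf_cutoff=0.01):
    """Collapse rare variants to carrier status and test for enrichment in cases."""
    flags = carrier_flags(genotypes, mafs, maf_cutoff)
    a = sum(1 for f, case in zip(flags, is_case) if f and case)
    b = sum(1 for f, case in zip(flags, is_case) if not f and case)
    c = sum(1 for f, case in zip(flags, is_case) if f and not case)
    d = sum(1 for f, case in zip(flags, is_case) if not f and not case)
    return fisher_one_sided(a, b, c, d)


# Toy data (made up): six individuals, three variants, the second one is common.
toy_genotypes = [[1, 0, 0], [0, 1, 1], [1, 2, 0], [0, 1, 0], [0, 0, 0], [0, 2, 0]]
toy_mafs = [0.005, 0.2, 0.001]
toy_is_case = [True, True, True, False, False, False]
toy_p = collapsing_test(toy_genotypes, toy_mafs, toy_is_case)
```

A real analysis would also adjust for covariates and population structure, which a plain 2x2 table cannot do.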
📈 Post-GWAS Analysis
Downstream functional interpretation of GWAS findings to link genetic associations with biology.
Key Methods & Techniques:
Functional annotation of variants
Gene mapping & gene-based association testing
Pathway & gene set enrichment analysis
Meta-analysis
Cross-trait & pleiotropy analysis
Heritability estimation & partitioned heritability
Statistical fine-mapping & functional priors
eQTL & colocalization analysis
Mendelian Randomization (MR)
Visualization (Manhattan plots, LocusZoom)
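To make the meta-analysis item above concrete, here is a minimal sketch of a fixed-effect, inverse-variance-weighted combination of per-study effect estimates for a single variant. The function name and the input numbers are hypothetical; the sketch assumes the studies report effects on the same scale and for the same effect allele.

```python
from math import sqrt


def ivw_fixed_effect(betas, ses):
    """Combine per-study effects with inverse-variance weights.

    betas: effect estimates for one variant, one per study
    ses: matching standard errors
    Returns (pooled_beta, pooled_se).
    """
    if not betas or len(betas) != len(ses):
        raise ValueError("need one standard error per effect estimate")
    weights = [1.0 / se ** 2 for se in ses]  # more precise studies weigh more
    total = sum(weights)
    pooled_beta = sum(w * b for w, b in zip(weights, betas)) / total
    pooled_se = sqrt(1.0 / total)
    return pooled_beta, pooled_se


# Toy example with made-up numbers: two equally precise studies.
toy_beta, toy_se = ivw_fixed_effect([0.1, 0.3], [0.1, 0.1])
```

With equal standard errors the pooled effect is the plain average of the two estimates, and the pooled standard error shrinks by a factor of the square root of two.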
🎶 RNA-seq Data Analysis
Comprehensive transcriptomic analysis from bulk RNA sequencing data.
Key Methods & Techniques:
Raw data QC, alignment & post-alignment QC
Quantification & differential expression analysis
Pathway enrichment & over-representation analysis
Alternative splicing analysis
Batch correction & removal of unwanted variation
Gene fusion detection
Weighted gene co-expression network analysis (WGCNA)
Allele-specific expression
🔗 Multi-Omics Integration
Integrative analysis of genomics, transcriptomics, epigenomics, proteomics, metabolomics and other omics layers to obtain a holistic view of biological systems. This approach helps uncover interactions between different molecular layers, improve disease classification, elucidate regulatory mechanisms, and increase power and resolution beyond single-omics studies.
Key Methods & Techniques:
Cross-omics correlation and network construction (co-expression, co-methylation, multi-omics networks)
Multi-view data integration methods (e.g. MOFA, DIABLO, joint NMF)
Dimensionality reduction and latent factor modelling
Regulatory inference (linking epigenetic marks to gene expression and downstream effects)
Multi-omics clustering and subtype discovery
🦠 Microbial Genome Assembly & Annotation
Complete microbial genome assembly, annotation, and comparative genomics.
Key Methods & Techniques:
De novo genome assembly & quality control
Genome annotation pipelines
Replicon, integron, transposon, prophage & plasmid identification
Phylogroup & MLST typing
Phylogenetic tree construction
Virulence & antimicrobial resistance gene detection
Stress/heat/salt resistance genes, biofilm genes, heavy metal resistance genes
🌍 16S/18S/ITS & Whole-Genome Metagenomics
Comprehensive microbial community analysis using amplicon and shotgun sequencing.
Key Methods & Techniques:
16S, 18S & ITS amplicon analysis
Shotgun metagenomics
Taxonomic profiling (Kraken, MetaPhlAn)
Functional profiling (HUMAnN, eggNOG)
Community structure & diversity metrics
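As an example of the diversity metrics mentioned above, the sketch below computes the Shannon index (within-sample diversity) and the Bray-Curtis dissimilarity (between-sample difference) from taxon count vectors. The function names and toy counts are hypothetical; the sketch assumes both samples list the same taxa in the same order.

```python
from math import log


def shannon_index(counts):
    """Shannon diversity H = -sum(p * ln p) over taxa with non-zero counts."""
    total = sum(counts)
    if total <= 0:
        raise ValueError("sample has no reads")
    return -sum((c / total) * log(c / total) for c in counts if c > 0)


def bray_curtis(sample_a, sample_b):
    """Bray-Curtis dissimilarity: 0 for identical samples, 1 for no shared taxa."""
    if len(sample_a) != len(sample_b):
        raise ValueError("samples must list the same taxa")
    denominator = sum(sample_a) + sum(sample_b)
    if denominator == 0:
        raise ValueError("both samples are empty")
    return sum(abs(x - y) for x, y in zip(sample_a, sample_b)) / denominator


# Toy counts (made up): four taxa in two samples.
toy_even = [10, 10, 10, 10]
toy_skewed = [37, 1, 1, 1]
toy_h_even = shannon_index(toy_even)
toy_h_skewed = shannon_index(toy_skewed)
toy_distance = bray_curtis(toy_even, toy_skewed)
```

The evenly distributed sample reaches the maximum Shannon value for four taxa, ln(4), while the skewed sample scores lower.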
🌳 Phylogenetic Analysis
Evolutionary inference using molecular sequences and genomes.
Key Methods & Techniques:
Sequence alignment & phylogenetic tree construction
Molecular clock models
Comparative genomics
Phylogenetic placement & species identification
🤖 Machine Learning
Application of machine learning for predictive modeling in genomics and biomedicine.
Key Methods & Techniques:
Supervised & unsupervised learning
Feature selection & dimensionality reduction
Deep learning (CNNs, RNNs, transformers)
Model validation & interpretation (SHAP, LIME)