Single-Cell RNA-Seq Analysis Framework
A comprehensive Python framework for single-cell RNA-Seq data analysis, providing streamlined workflows for preprocessing, clustering, differential expression, and trajectory analysis with support for multiple data formats and analysis methods.
Overview
The scRNA-Seq Analysis Framework simplifies complex single-cell analysis workflows by providing a unified interface for common tasks. It integrates popular tools like Scanpy and Seurat into easy-to-use Python functions with consistent APIs.
Key Features
- Multi-Format Support: Read 10x Genomics, Drop-seq, CEL-seq data
- Preprocessing: Quality control, normalization, feature selection
- Dimensionality Reduction: PCA, UMAP, t-SNE
- Clustering: Leiden, Louvain, K-means algorithms
- Differential Expression: Scanpy, Seurat, edgeR integration
- Trajectory Analysis: PAGA, Monocle, Slingshot
- Visualization: Publication-ready plots and interactive dashboards
Core Modules
1. Data Loading and Preprocessing
from scrnaseq import preprocessing

# Load data from various formats
adata = preprocessing.load_data(
    data_dir='data/10x_matrix',
    format='10x'
)

# Quality control
adata = preprocessing.quality_control(
    adata,
    min_genes=200,
    max_genes=5000,
    mt_threshold=10
)

# Normalization and scaling
adata = preprocessing.normalize_data(
    adata,
    method='log1p'
)
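The framework's internals are not shown here, so the following is only a NumPy sketch of what filters such as `min_genes`, `max_genes`, and `mt_threshold` commonly mean. It assumes `mt_threshold` is a maximum percentage of mitochondrial counts per cell and that `log1p` normalization scales each cell to a fixed total before taking log(1 + x). The toy matrix, the gene names, the `MT-` prefix convention, and the `target_sum` value are illustrative assumptions, not part of the framework.

```python
import numpy as np

# Toy count matrix: rows are cells, columns are genes (hypothetical data).
gene_names = np.array(["MT-CO1", "ACTB", "GAPDH", "CD3E"])
counts = np.array([
    [1, 5, 3, 2],    # 4 genes detected, about 9% mitochondrial counts
    [9, 1, 0, 0],    # 90% mitochondrial counts: likely a damaged cell
    [0, 2, 0, 0],    # only 1 gene detected: likely an empty droplet
    [0, 4, 4, 2],    # 3 genes detected, 0% mitochondrial counts
], dtype=float)

def qc_mask(counts, gene_names, min_genes, max_genes, mt_threshold):
    """Return a boolean mask of cells passing the three QC filters."""
    genes_per_cell = (counts > 0).sum(axis=1)
    is_mito = np.char.startswith(gene_names, "MT-")  # assumed naming convention
    total = counts.sum(axis=1)
    pct_mito = 100.0 * counts[:, is_mito].sum(axis=1) / np.maximum(total, 1)
    return (
        (genes_per_cell >= min_genes)
        & (genes_per_cell <= max_genes)
        & (pct_mito <= mt_threshold)
    )

def normalize_log1p(counts, target_sum=1e4):
    """Scale each cell to target_sum total counts, then apply log(1 + x)."""
    totals = counts.sum(axis=1, keepdims=True)
    return np.log1p(counts / np.maximum(totals, 1) * target_sum)

# min_genes is lowered to 2 only because the toy matrix has four genes.
keep = qc_mask(counts, gene_names, min_genes=2, max_genes=5000, mt_threshold=10)
normalized = normalize_log1p(counts[keep])
```

Filtering before normalization matters in this sketch: the per-cell totals used for scaling are computed only on cells that survive QC.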
2. Dimensionality Reduction and Clustering
from scrnaseq import clustering

# Dimensionality reduction
adata = clustering.dimensionality_reduction(
    adata,
    n_pcs=50,
    n_neighbors=15
)

# Clustering
adata = clustering.cluster_cells(
    adata,
    resolution=0.8,
    algorithm='leiden'
)

# Visualization
clustering.plot_umap(adata, color='leiden')
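In a conventional single-cell pipeline, `n_pcs` sets how many principal components are kept and `n_neighbors` sets the size of the k-nearest-neighbor graph that graph-based clustering then runs on. Assuming the framework follows that convention, the two steps can be sketched in plain NumPy as below. The helper names `pca_scores` and `knn_indices` are hypothetical, the data are simulated, and Leiden clustering itself is not reproduced.

```python
import numpy as np

def pca_scores(X, n_pcs):
    """Project cells onto the top n_pcs principal components via SVD."""
    Xc = X - X.mean(axis=0)  # center each gene
    U, S, _ = np.linalg.svd(Xc, full_matrices=False)
    return U[:, :n_pcs] * S[:n_pcs]

def knn_indices(Z, n_neighbors):
    """For each cell, return the indices of its nearest neighbors (self excluded)."""
    sq = (Z ** 2).sum(axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * Z @ Z.T  # squared Euclidean distances
    np.fill_diagonal(d2, np.inf)                    # a cell is not its own neighbor
    return np.argsort(d2, axis=1)[:, :n_neighbors]

# Two simulated, well-separated groups of 20 cells with 30 genes each.
rng = np.random.default_rng(0)
group_a = rng.normal(loc=0.0, scale=0.5, size=(20, 30))
group_b = rng.normal(loc=5.0, scale=0.5, size=(20, 30))
X = np.vstack([group_a, group_b])

Z = pca_scores(X, n_pcs=5)
neighbors = knn_indices(Z, n_neighbors=5)
```

The dense distance matrix keeps the sketch short but grows quadratically with the number of cells; real datasets normally use an approximate neighbor search instead.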
3. Differential Expression Analysis
from scrnaseq import differential_expression

# Find marker genes
markers = differential_expression.find_markers(
    adata,
    groupby='leiden',
    method='wilcoxon'
)

# Compare conditions
de_results = differential_expression.compare_conditions(
    adata,
    groupby='condition',
    reference='control'
)
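To make `method='wilcoxon'` concrete, here is a minimal sketch of a one-vs-rest Wilcoxon rank-sum test for a single gene, using the normal approximation without tie correction. It illustrates the statistic only and is not the framework's implementation; the function names and expression values are invented for the example.

```python
import math
import numpy as np

def average_ranks(x):
    """Rank values from 1..n, giving tied values their average rank."""
    _, inverse, counts = np.unique(x, return_inverse=True, return_counts=True)
    last_rank = np.cumsum(counts)  # highest rank held by each distinct value
    return (last_rank - (counts - 1) / 2.0)[inverse]

def rank_sum_test(in_group, rest):
    """Return the rank-sum z-score and two-sided p-value (normal approximation)."""
    n1, n2 = len(in_group), len(rest)
    ranks = average_ranks(np.concatenate([in_group, rest]))
    rank_sum = ranks[:n1].sum()
    expected = n1 * (n1 + n2 + 1) / 2.0
    std = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12.0)
    z = (rank_sum - expected) / std
    p = math.erfc(abs(z) / math.sqrt(2.0))
    return z, p

# Hypothetical expression of one gene across ten cells in two clusters.
expr = np.array([5.0, 6.0, 7.0, 8.0, 0.0, 1.0, 0.5, 2.0, 1.5, 0.2])
clusters = np.array(["0", "0", "0", "0", "1", "1", "1", "1", "1", "1"])

# Cluster "0" against all remaining cells.
z, p = rank_sum_test(expr[clusters == "0"], expr[clusters != "0"])
```

A positive `z` means the gene ranks higher inside the cluster than outside it. Repeating the test for every gene and cluster, then correcting the p-values for multiple testing, gives a marker table.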
4. Trajectory Analysis
from scrnaseq import trajectory

# PAGA trajectory inference
adata = trajectory.paga_trajectory(
    adata,
    groups='leiden'
)

# Plot trajectory
trajectory.plot_paga(adata)
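PAGA summarizes the cell-level neighbor graph as a much smaller graph whose nodes are clusters. The sketch below shows only that abstraction step, using a plain fraction of inter-cluster edges. This is a deliberate simplification and not PAGA's actual connectivity statistic, and the six-cell graph and the function name are hypothetical.

```python
import numpy as np

def cluster_edge_fractions(neighbors, labels):
    """Fraction of each cluster's kNN edges that point into every cluster."""
    cluster_ids = np.unique(labels)
    index = {c: i for i, c in enumerate(cluster_ids)}
    counts = np.zeros((len(cluster_ids), len(cluster_ids)))
    for cell, nbrs in enumerate(neighbors):
        for nbr in nbrs:
            counts[index[labels[cell]], index[labels[nbr]]] += 1
    return cluster_ids, counts / counts.sum(axis=1, keepdims=True)

# Hypothetical kNN graph for six cells arranged along a line: A A B B C C
labels = np.array(["A", "A", "B", "B", "C", "C"])
neighbors = np.array([
    [1, 2],  # cell 0 (A)
    [0, 2],  # cell 1 (A)
    [1, 3],  # cell 2 (B)
    [2, 4],  # cell 3 (B)
    [3, 5],  # cell 4 (C)
    [4, 3],  # cell 5 (C)
])

cluster_ids, fractions = cluster_edge_fractions(neighbors, labels)
```

In this toy graph, clusters A and C share no edges while both connect to B, so the cluster-level graph recovers the A to B to C ordering that a trajectory plot would display.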
Installation
# Clone the repository
git clone https://github.com/tamoghnadas12/scRNA-seq-framework
cd scRNA-seq-framework
# Create conda environment
conda env create -f environment.yml
conda activate scrnaseq-framework
# Install package
pip install -e .
Usage Examples
Complete Analysis Workflow
import scrnaseq as sc

# Initialize analysis
analysis = sc.ScRNASeqAnalysis(
    data_dir='data/10x_matrix',
    output_dir='results'
)

# Run complete pipeline
analysis.preprocess_data()
analysis.cluster_cells(resolution=0.8)
analysis.find_markers()
analysis.run_trajectory_analysis()

# Generate report
analysis.generate_report()
Custom Analysis
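import scanpy as sc  # the calls below match Scanpy's API, so `sc` refers to scanpy here

# Load pre-processed data
adata = sc.read_h5ad('results/processed_data.h5ad')

# Custom clustering with different parameters
# (assumes the file already stores a neighbor graph, a UMAP embedding,
# and a 'cell_type' column in adata.obs)
sc.tl.leiden(adata, resolution=1.2)
sc.pl.umap(adata, color=['leiden', 'cell_type'], ncols=2)

# Differential expression for specific clusters:
# rank genes for the new clustering first, then extract the table for cluster '0'
sc.tl.rank_genes_groups(adata, groupby='leiden', method='wilcoxon')
de_genes = sc.get.rank_genes_groups_df(adata, group='0')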
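The table returned by Scanpy's `rank_genes_groups_df` is a pandas DataFrame with columns such as `names`, `scores`, `logfoldchanges`, `pvals`, and `pvals_adj`. A common next step is to keep only significant, upregulated genes. The sketch below does this on a made-up stand-in table; the `top_markers` helper and its thresholds are illustrative assumptions rather than recommended defaults.

```python
import pandas as pd

# Hypothetical stand-in for the DE table; all values are made up.
de_genes = pd.DataFrame({
    "names": ["CD3E", "ACTB", "MS4A1", "GNLY"],
    "scores": [12.1, 0.4, -8.3, 6.7],
    "logfoldchanges": [3.2, 0.1, -2.9, 1.8],
    "pvals": [1e-30, 0.7, 1e-15, 1e-10],
    "pvals_adj": [1e-27, 0.9, 1e-13, 1e-8],
})

def top_markers(df, max_padj=0.05, min_logfc=1.0, n=10):
    """Keep significant, upregulated genes and return the strongest n by score."""
    hits = df[(df["pvals_adj"] < max_padj) & (df["logfoldchanges"] > min_logfc)]
    return hits.sort_values("scores", ascending=False).head(n)

markers = top_markers(de_genes)
```

Filtering on the adjusted p-value rather than the raw one accounts for the thousands of genes tested at once.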
Technologies Integrated
- Core: Scanpy, Anndata
- Clustering: Leiden, Louvain, Scikit-learn
- Visualization: UMAP, t-SNE, Matplotlib, Seaborn
- Trajectory: PAGA, Palantir, Slingshot
- Differential Expression: Scanpy, Seurat, edgeR
- Data Formats: 10x Genomics, Drop-seq, CEL-seq
Interactive Demo
Check out our interactive demo to see the framework in action with real single-cell data.
Contributing
We welcome contributions from the community! Please read our contributing guidelines for details on how to get involved.
License
This project is licensed under the BSD 3-Clause License - see the LICENSE file for details.