Trajectory Inference with VIA and scVelo#

When scRNA-velocity is available, it can be used to guide the trajectory inference and automate initial state prediction. However, because RNA velocitycan be misguided by(Bergen 2021) boosts in expression, variable transcription rates and data capture scope limited to steady-state populations only, users might find it useful to adjust the level of influence the RNA-velocity data should exercise on the inferred TI.

Paper: Generalized and scalable trajectory inference in single-cell omics data with VIA

Code: ShobiStassen/VIA

Colab_Reproducibility:https://colab.research.google.com/drive/1MtGr3e9uUb_BWOzKlcbOTiCYsZpljEyF?usp=sharing

import omicverse as ov
import scanpy as sc
import scvelo as scv
import cellrank as cr
ov.utils.ov_plot_set()

Data loading and preprocessing#

We use a familiar endocrine-genesis dataset (Bastidas-Ponce et al. (2019).) to demonstrate initial state prediction at the EP Ngn3 low cells and automatic captures of the 4 differentiated islets (alpha, beta, delta and epsilon). As mentioned, it us useful to control the level of influence of RNA-velocity relative to gene-gene distance and this is done using the velo_weight parameter.

adata = cr.datasets.pancreas()
adata
AnnData object with n_obs × n_vars = 2531 × 27998
    obs: 'day', 'proliferation', 'G2M_score', 'S_score', 'phase', 'clusters_coarse', 'clusters', 'clusters_fine', 'louvain_Alpha', 'louvain_Beta', 'palantir_pseudotime'
    var: 'highly_variable_genes'
    uns: 'clusters_colors', 'clusters_fine_colors', 'day_colors', 'louvain_Alpha_colors', 'louvain_Beta_colors', 'neighbors', 'pca'
    obsm: 'X_pca', 'X_umap'
    layers: 'spliced', 'unspliced'
    obsp: 'connectivities', 'distances'
n_pcs = 30
scv.pp.filter_and_normalize(adata, min_shared_counts=20, n_top_genes=5000)
sc.tl.pca(adata, n_comps = n_pcs)
scv.pp.moments(adata, n_pcs=None, n_neighbors=None)
scv.tl.velocity(adata, mode='stochastic') # good results acheived with mode = 'stochastic' too
Filtered out 22024 genes that are detected 20 counts (shared).
Normalized count data: X, spliced, unspliced.
Extracted 5000 highly variable genes.
Logarithmized X.
computing PCA
    on highly variable genes
    with n_comps=30
    finished (0:00:01)
computing moments based on connectivities
    finished (0:00:00) --> added 
    'Ms' and 'Mu', moments of un/spliced abundances (adata.layers)
computing velocities
    finished (0:00:00) --> added 
    'velocity', velocity vectors for each individual cell (adata.layers)

Initialize and run VIA#

v0 = ov.single.pyVIA(adata=adata,adata_key='X_pca',adata_ncomps=n_pcs, basis='X_umap',
                         clusters='clusters',knn=20, root_user=None,
                         dataset='', random_seed=42,is_coarse=True, preserve_disconnected=True, pseudotime_threshold_TS=50,
                         piegraph_arrow_head_width=0.15,piegraph_edgeweight_scalingfactor=2.5,
                         velocity_matrix=adata.layers['velocity'],gene_matrix=adata.X.todense(),velo_weight=0.5,
                         edgebundle_pruning_twice=False, edgebundle_pruning=0.15, pca_loadings = adata.varm['PCs']
                        )

v0.run()
2023-04-08 16:32:57.976202	Running VIA over input data of 2531 (samples) x 30 (features)
2023-04-08 16:32:57.976308	Knngraph has 20 neighbors
2023-04-08 16:32:58.800324	Finished global pruning of 20-knn graph used for clustering at level of 0.15. Kept 49.3 % of edges. 
2023-04-08 16:32:58.807060	Number of connected components used for clustergraph  is 1
2023-04-08 16:32:59.274711	Commencing community detection
2023-04-08 16:32:59.296206	Finished running Leiden algorithm. Found 67 clusters.
2023-04-08 16:32:59.296818	Merging 50 very small clusters (<10)
2023-04-08 16:32:59.297396	Finished detecting communities. Found 17 communities
2023-04-08 16:32:59.297494	Making cluster graph. Global cluster graph pruning level: 0.15
2023-04-08 16:32:59.300366	Graph has 1 connected components before pruning
2023-04-08 16:32:59.301039	Graph has 1 connected components after pruning
2023-04-08 16:32:59.301105	Graph has 1 connected components after reconnecting
2023-04-08 16:32:59.301317	0.0% links trimmed from local pruning relative to start
2023-04-08 16:32:59.301323	57.4% links trimmed from global pruning relative to start
2023-04-08 16:32:59.302247	Starting make edgebundle viagraph...
2023-04-08 16:32:59.302252	Make via clustergraph edgebundle
2023-04-08 16:33:00.142548	Hammer dims: Nodes shape: (17, 2) Edges shape: (52, 3)
size velocity matrix 2531 (2531, 5000)
2023-04-08 16:33:00.173534	Looking for initial states
2023-04-08 16:33:00.175672	Stationary distribution normed [0.086 0.038 0.181 0.009 0.084 0.027 0.003 0.11  0.028 0.037 0.103 0.097
 0.049 0.031 0.051 0.017 0.048]
2023-04-08 16:33:00.175849	Top 5 candidates for root: [ 6  3 15  5  8 13  9  1 16 12] with stationary prob (%) [0.323 0.932 1.742 2.733 2.78  3.051 3.681 3.762 4.82  4.902]
2023-04-08 16:33:00.175966	Top 5 candidates for terminal: [ 2  7 10 11  0]
2023-04-08 16:33:00.176867	component number 0 out of  [0]
2023-04-08 16:33:00.186598	Using the RNA velocity graph, A top3 candidate for initial state is 6 comprising predominantly of Ngn3 low EP cells
2023-04-08 16:33:00.186997	Using the RNA velocity graph, A top3 candidate for initial state is 3 comprising predominantly of Ngn3 high EP cells
2023-04-08 16:33:00.187360	Using the RNA velocity graph, A top3 candidate for initial state is 15 comprising predominantly of Delta cells
2023-04-08 16:33:00.187716	Using the RNA velocity graph, A top3 candidate for initial state is 5 comprising predominantly of Fev+ cells
2023-04-08 16:33:00.188051	Using the RNA velocity graph, A top3 candidate for initial state is 8 comprising predominantly of Ngn3 high EP cells
2023-04-08 16:33:00.188384	Using the RNA velocity graph, A top3 candidate for initial state is 13 comprising predominantly of Ngn3 high EP cells
2023-04-08 16:33:00.188723	Using the RNA velocity graph, A top3 candidate for initial state is 9 comprising predominantly of Ngn3 high EP cells
2023-04-08 16:33:00.189079	Using the RNA velocity graph, A top3 candidate for initial state is 1 comprising predominantly of Alpha cells
2023-04-08 16:33:00.189405	Using the RNA velocity graph, A top3 candidate for initial state is 16 comprising predominantly of Epsilon cells
2023-04-08 16:33:00.189749	Using the RNA velocity graph, A top3 candidate for initial state is 12 comprising predominantly of Epsilon cells
2023-04-08 16:33:00.189752	Using the RNA velocity graph, the suggested initial root state is 6 comprising predominantly of Ngn3 low EP cells
2023-04-08 16:33:00.189759	Computing lazy-teleporting expected hitting times
/Users/fernandozeng/miniforge3/envs/scbasset/lib/python3.8/site-packages/phate/__init__.py
/Users/fernandozeng/miniforge3/envs/scbasset/lib/python3.8/site-packages/phate/__init__.py
/Users/fernandozeng/miniforge3/envs/scbasset/lib/python3.8/site-packages/phate/__init__.py
/Users/fernandozeng/miniforge3/envs/scbasset/lib/python3.8/site-packages/phate/__init__.py
/Users/fernandozeng/miniforge3/envs/scbasset/lib/python3.8/site-packages/phate/__init__.py
2023-04-08 16:33:12.895333	Identifying terminal clusters corresponding to unique lineages...
2023-04-08 16:33:12.895502	Closeness:[0, 1, 2, 3, 4, 5, 6, 7, 10]
2023-04-08 16:33:12.895513	Betweenness:[0, 1, 4, 5, 6, 10, 12, 13, 15]
2023-04-08 16:33:12.895516	Out Degree:[0, 1, 2, 3, 4, 5, 6, 7, 10, 12, 15]
2023-04-08 16:33:12.895627	Cluster 0 had 3 or more neighboring terminal states [2, 4, 5, 7] and so we removed cluster 5
2023-04-08 16:33:12.895647	Cluster 2 had 3 or more neighboring terminal states [0, 7, 10] and so we removed cluster 0
2023-04-08 16:33:12.895758	Terminal clusters corresponding to unique lineages in this component are [1, 2, 4, 7, 10, 12] 
/Users/fernandozeng/miniforge3/envs/scbasset/lib/python3.8/site-packages/phate/__init__.py
/Users/fernandozeng/miniforge3/envs/scbasset/lib/python3.8/site-packages/phate/__init__.py
/Users/fernandozeng/miniforge3/envs/scbasset/lib/python3.8/site-packages/phate/__init__.py
/Users/fernandozeng/miniforge3/envs/scbasset/lib/python3.8/site-packages/phate/__init__.py
/Users/fernandozeng/miniforge3/envs/scbasset/lib/python3.8/site-packages/phate/__init__.py
2023-04-08 16:33:24.271222	From root 6,  the Terminal state 1 is reached 266 times.
/Users/fernandozeng/miniforge3/envs/scbasset/lib/python3.8/site-packages/phate/__init__.py
/Users/fernandozeng/miniforge3/envs/scbasset/lib/python3.8/site-packages/phate/__init__.py
/Users/fernandozeng/miniforge3/envs/scbasset/lib/python3.8/site-packages/phate/__init__.py
/Users/fernandozeng/miniforge3/envs/scbasset/lib/python3.8/site-packages/phate/__init__.py
/Users/fernandozeng/miniforge3/envs/scbasset/lib/python3.8/site-packages/phate/__init__.py
2023-04-08 16:33:34.962528	From root 6,  the Terminal state 2 is reached 640 times.
/Users/fernandozeng/miniforge3/envs/scbasset/lib/python3.8/site-packages/phate/__init__.py
/Users/fernandozeng/miniforge3/envs/scbasset/lib/python3.8/site-packages/phate/__init__.py
/Users/fernandozeng/miniforge3/envs/scbasset/lib/python3.8/site-packages/phate/__init__.py
/Users/fernandozeng/miniforge3/envs/scbasset/lib/python3.8/site-packages/phate/__init__.py
/Users/fernandozeng/miniforge3/envs/scbasset/lib/python3.8/site-packages/phate/__init__.py
2023-04-08 16:33:45.675023	From root 6,  the Terminal state 4 is reached 467 times.
/Users/fernandozeng/miniforge3/envs/scbasset/lib/python3.8/site-packages/phate/__init__.py
/Users/fernandozeng/miniforge3/envs/scbasset/lib/python3.8/site-packages/phate/__init__.py
/Users/fernandozeng/miniforge3/envs/scbasset/lib/python3.8/site-packages/phate/__init__.py
/Users/fernandozeng/miniforge3/envs/scbasset/lib/python3.8/site-packages/phate/__init__.py
/Users/fernandozeng/miniforge3/envs/scbasset/lib/python3.8/site-packages/phate/__init__.py
2023-04-08 16:33:56.019835	From root 6,  the Terminal state 7 is reached 628 times.
/Users/fernandozeng/miniforge3/envs/scbasset/lib/python3.8/site-packages/phate/__init__.py
/Users/fernandozeng/miniforge3/envs/scbasset/lib/python3.8/site-packages/phate/__init__.py
/Users/fernandozeng/miniforge3/envs/scbasset/lib/python3.8/site-packages/phate/__init__.py
/Users/fernandozeng/miniforge3/envs/scbasset/lib/python3.8/site-packages/phate/__init__.py
/Users/fernandozeng/miniforge3/envs/scbasset/lib/python3.8/site-packages/phate/__init__.py
2023-04-08 16:34:06.545564	From root 6,  the Terminal state 10 is reached 645 times.
/Users/fernandozeng/miniforge3/envs/scbasset/lib/python3.8/site-packages/phate/__init__.py
/Users/fernandozeng/miniforge3/envs/scbasset/lib/python3.8/site-packages/phate/__init__.py
/Users/fernandozeng/miniforge3/envs/scbasset/lib/python3.8/site-packages/phate/__init__.py
/Users/fernandozeng/miniforge3/envs/scbasset/lib/python3.8/site-packages/phate/__init__.py
/Users/fernandozeng/miniforge3/envs/scbasset/lib/python3.8/site-packages/phate/__init__.py
2023-04-08 16:34:17.597466	From root 6,  the Terminal state 12 is reached 237 times.
2023-04-08 16:34:17.614112	Terminal clusters corresponding to unique lineages are {1: 'Alpha', 2: 'Beta', 4: 'Alpha', 7: 'Beta', 10: 'Beta', 12: 'Epsilon'}
2023-04-08 16:34:17.614139	Begin projection of pseudotime and lineage likelihood
2023-04-08 16:34:17.913029	Transition matrix with weight of 0.5 on RNA velocity
2023-04-08 16:34:17.913694	Graph has 1 connected components before pruning
2023-04-08 16:34:17.914918	Graph has 1 connected components after pruning
2023-04-08 16:34:17.914987	Graph has 1 connected components after reconnecting
2023-04-08 16:34:17.915221	40.4% links trimmed from local pruning relative to start
2023-04-08 16:34:17.915231	65.4% links trimmed from global pruning relative to start
2023-04-08 16:34:17.917529	Start making edgebundle milestone...
2023-04-08 16:34:17.917562	Start finding milestones
2023-04-08 16:34:18.189857	End milestones
2023-04-08 16:34:18.189946	Will use via-pseudotime for edges, otherwise consider providing a list of numeric labels (single cell level) or via_object
2023-04-08 16:34:18.193933	Recompute weights
2023-04-08 16:34:18.203565	pruning milestone graph based on recomputed weights
2023-04-08 16:34:18.204081	Graph has 1 connected components before pruning
2023-04-08 16:34:18.204384	Graph has 1 connected components after pruning
2023-04-08 16:34:18.204457	Graph has 1 connected components after reconnecting
2023-04-08 16:34:18.204894	63.9% links trimmed from global pruning relative to start
2023-04-08 16:34:18.204906	regenerate igraph on pruned edges
2023-04-08 16:34:18.207778	Setting numeric label as single cell pseudotime for coloring edges
2023-04-08 16:34:18.213883	Making smooth edges
2023-04-08 16:34:18.577366	Time elapsed 80.3 seconds
fig, ax, ax1 = v0.plot_piechart_graph(clusters='clusters',cmap='Reds',dpi=80,
                                   show_legend=False,ax_text=False,fontsize=4)
fig.set_size_inches(8,4)

Visualize trajectory and cell progression#

Fine grained vector fields

v0.plot_trajectory_gams(basis='X_umap',clusters='clusters',draw_all_curves=False)
2023-04-08 16:35:44.495793	Super cluster 1 is a super terminal with sub_terminal cluster 1
2023-04-08 16:35:44.495818	Super cluster 2 is a super terminal with sub_terminal cluster 2
2023-04-08 16:35:44.495824	Super cluster 4 is a super terminal with sub_terminal cluster 4
2023-04-08 16:35:44.495830	Super cluster 7 is a super terminal with sub_terminal cluster 7
2023-04-08 16:35:44.495836	Super cluster 10 is a super terminal with sub_terminal cluster 10
2023-04-08 16:35:44.495841	Super cluster 12 is a super terminal with sub_terminal cluster 12
(<Figure size 640x320 with 3 Axes>,
 <AxesSubplot: title={'center': 'True Labels: ncomps:30. knn:20'}>,
 <AxesSubplot: title={'center': 'Pseudotime'}>)
../_images/24d3f5f6faaef5ccec0992cd0eb1a6b227cbeae4aef7c916b0737fc3cb93a4b8.png
v0.plot_stream(basis='X_umap',clusters='clusters',
               density_grid=0.8, scatter_size=30, scatter_alpha=0.3, linewidth=0.5)
(<Figure size 320x320 with 1 Axes>,
 <AxesSubplot: title={'center': 'Streamplot'}>)
../_images/01b60ebf2b191fab824792ed03e1ab38939429c7435e264da215e431e20873d3.png

Draw lineage likelihoods#

These indicate potential pathways corresponding to the 4 islets (two types of Beta islets Lineage 5 and 12)

v0.plot_lineage_probability()
2023-04-08 16:36:18.551874	Marker_lineages: [1, 2, 4, 7, 10, 12]
2023-04-08 16:36:18.561868	The number of components in the original full graph is 1
2023-04-08 16:36:18.561916	For downstream visualization purposes we are also constructing a low knn-graph 
2023-04-08 16:36:19.459300	Check sc pb [0.08  0.261 0.209 0.194 0.18  0.076]
2023-04-08 16:36:19.494450	Cluster path on clustergraph starting from Root Cluster 6 to Terminal Cluster 1: [6, 3, 8, 9, 16, 12, 1]
2023-04-08 16:36:19.494481	Cluster path on clustergraph starting from Root Cluster 6 to Terminal Cluster 2: [6, 3, 8, 9, 16, 11, 7, 2]
2023-04-08 16:36:19.494486	Cluster path on clustergraph starting from Root Cluster 6 to Terminal Cluster 4: [6, 3, 8, 13, 5, 0, 4]
2023-04-08 16:36:19.494491	Cluster path on clustergraph starting from Root Cluster 6 to Terminal Cluster 7: [6, 3, 8, 9, 16, 11, 7]
2023-04-08 16:36:19.494495	Cluster path on clustergraph starting from Root Cluster 6 to Terminal Cluster 10: [6, 3, 8, 9, 16, 11, 7, 2, 10]
2023-04-08 16:36:19.494499	Cluster path on clustergraph starting from Root Cluster 6 to Terminal Cluster 12: [6, 3, 8, 9, 16, 12]
2023-04-08 16:36:19.805794	Revised Cluster level path on sc-knnGraph from Root Cluster 6 to Terminal Cluster 1 along path: [6, 6, 6, 6, 3, 9, 16, 12, 1, 1, 1]
2023-04-08 16:36:19.816800	Revised Cluster level path on sc-knnGraph from Root Cluster 6 to Terminal Cluster 2 along path: [6, 6, 6, 6, 3, 9, 16, 2, 2, 2, 2, 2, 2]
2023-04-08 16:36:19.827095	Revised Cluster level path on sc-knnGraph from Root Cluster 6 to Terminal Cluster 4 along path: [6, 6, 6, 6, 3, 9, 16, 11, 4, 4, 4]
2023-04-08 16:36:19.837298	Revised Cluster level path on sc-knnGraph from Root Cluster 6 to Terminal Cluster 7 along path: [6, 6, 6, 6, 3, 9, 16, 12, 4, 7]
2023-04-08 16:36:19.847061	Revised Cluster level path on sc-knnGraph from Root Cluster 6 to Terminal Cluster 10 along path: [6, 6, 6, 6, 3, 9, 16, 2, 10, 10, 10, 10, 10]
2023-04-08 16:36:19.856823	Revised Cluster level path on sc-knnGraph from Root Cluster 6 to Terminal Cluster 12 along path: [6, 6, 6, 6, 3, 9, 16, 12, 12, 12, 12]
(<Figure size 640x320 with 14 Axes>,
 array([[<AxesSubplot: title={'center': 'Lineage: 1-Alpha'}>,
         <AxesSubplot: title={'center': 'Lineage: 2-Beta'}>,
         <AxesSubplot: title={'center': 'Lineage: 4-Alpha'}>,
         <AxesSubplot: title={'center': 'Lineage: 7-Beta'}>],
        [<AxesSubplot: title={'center': 'Lineage: 10-Beta'}>,
         <AxesSubplot: title={'center': 'Lineage: 12-Epsilon'}>,
         <AxesSubplot: >, <AxesSubplot: >]], dtype=object))
../_images/fd27620e9692821d8d46b04836803c7e84d5f888268016c5a125b0a3abfc57b8.png